Making a Voice Recorder on Windows Phone
Introduction
Questions about working with the microphone, making usable recordings from it, and a few related topics come up regularly in the forums. I'm sure they will come up again, so I thought it would be valuable to have an example I could refer to. I've made this voice memo application, put it through Marketplace certification, and made the source code available to all who wish to use it. Feel free to use this code in just about any manner you want. If you use it in your own app I'd appreciate a message telling me that you found the code useful, but if you don't send one I won't hold it against you. This code is free of obligations, though I strongly discourage you from submitting it to the Windows Phone Marketplace in unmodified form.

I have deliberately not yet spent any effort on making this program's interface pretty. This post is all about functionality, and since I'm giving the code away I didn't want to pay my graphic artist for image assets only to give them away. If you use the code it's up to you to apply your own graphics. Now that I've put this program together I plan to make more changes to it next week (top priority: making the program look good!).
What Do I Need?
All you need to work with this code is a Windows PC and the Windows Phone Developer Tools from http://developer.windowsphone.com (a free download!). I'm using Visual Studio 2010 Ultimate, but the Express edition included in the developer tools will work just as well.
Deciding on a Feature Set
Before starting on the application I sat down and listed and prioritized the features that I wanted the application to have. In no particular order some of the things I thought about included the following.
- Ability to Export Recordings
- Save Recordings in WAV format
- Add notes to recording
- Speed up or slow down recordings
- Change Voice
- Combine, Split, and Edit Recordings
- Export as MP3
- Categorize Memos
- Time/Date activated Reminders
As you can see there are a lot of different things that one could add to a voice memo application. It can quickly progress from something simple to something complex. Rather than making the application overly complex I chose a minimum feature set. My primary goal was to actually deliver something, and keeping the application simple leaves fewer places in which bugs could occur. The reduced feature set is as follows:
- Save Recordings in WAV format
- Order recordings by date or by name
- Record under lock screen
- Add Notes to recording
This is a simple beginning and something on which other features can be added later.
Using XNA classes from a Silverlight Application
There are two types of applications that you can create on Windows Phone: those that use Silverlight for their UI and those that use the XNA rendering classes for their UI. You must use one type of UI presentation layer or the other exclusively. There's no way for you to use XNA rendering classes from a Silverlight application or vice versa. Silverlight offers several controls that can be used for building the application's UI from a designer, such as buttons, text boxes, labels, and so on. Within XNA you are responsible for building your own solution for presenting information. For that reason I am using a Silverlight UI for this application.
For recording audio I must make use of the Microphone class from Microsoft.Xna.Framework.Audio. While you can't use XNA rendering classes in a Silverlight application, you can use many of the other XNA classes. Use of the audio-related XNA classes requires that FrameworkDispatcher.Update() be called periodically. Rather than convolute your program logic with a timer calling this function, you can make use of an example ApplicationService that Microsoft provides for this purpose. The class will take care of calling this function for you. The entirety of the class follows.
using System;
using System.Windows;
using System.Windows.Threading;
using Microsoft.Xna.Framework;

public class XNAFrameworkDispatcherService : IApplicationService
{
    private DispatcherTimer frameworkDispatcherTimer;

    public XNAFrameworkDispatcherService()
    {
        this.frameworkDispatcherTimer = new DispatcherTimer();
        // 333,333 ticks of 100 ns each is about 33 ms, or roughly 30 updates per second.
        this.frameworkDispatcherTimer.Interval = TimeSpan.FromTicks(333333);
        this.frameworkDispatcherTimer.Tick += frameworkDispatcherTimer_Tick;
        FrameworkDispatcher.Update();
    }

    void frameworkDispatcherTimer_Tick(object sender, EventArgs e) { FrameworkDispatcher.Update(); }

    void IApplicationService.StartService(ApplicationServiceContext context) { this.frameworkDispatcherTimer.Start(); }

    void IApplicationService.StopService() { this.frameworkDispatcherTimer.Stop(); }
}
Once the class is declared in your project it needs to be added as an application lifetime object. There's more than one way to do this, but my preferred method is to add it to App.xaml.
<Application.ApplicationLifetimeObjects>
<!--Required object that handles lifetime events for the application-->
<shell:PhoneApplicationService
Launching="Application_Launching" Closing="Application_Closing"
Activated="Application_Activated" Deactivated="Application_Deactivated"/>
<local:XNAFrameworkDispatcherService />
</Application.ApplicationLifetimeObjects>
Having done this I need not give FrameworkDispatcher.Update another thought; the service will automatically be started when the program starts and automatically shut down when the program ends.
Recording Audio with the Microphone Class
There are plenty of examples on the Internet of how to record audio on WP7. Unfortunately many of them contain the same bug. Before I present the code that implements recording I want to illustrate how recording works so that I can also demonstrate the bug.
The Microphone class records audio in chunks and passes each chunk back to your program while it continues to record a new chunk. To do this the Microphone class has its own memory buffer that it will fill. Let's say you are recording the phrase "The quick brown fox jumped over the lazy dog." For now let's also assume that the Microphone's buffer happens to be able to record one word at a time (things generally don't end up falling so cleanly in real life, but I ask you to temporarily suspend your ability to apply that thought).

You begin speaking the phrase and the microphone's buffer gets filled with the sound of you saying the word "the."

Once the buffer is full it gets passed to the program and the Microphone begins filling a new buffer with the next word being recorded. The program receives the buffer and gets a chance to do something with it. Since this program is for saving and replaying recorded audio clips, the program will save the audio chunk and wait for the next chunk so that it can be appended to the previous one.

As each chunk is recorded it gets passed off to the program, and the program appends it to the chunks it has already received. The bug that many of the examples online have occurs when the user speaks the last word.

In many of the online examples, when the user has said the word "dog" and has pressed the stop button, the program stops receiving further information from the microphone. But the last word hasn't been passed from the microphone's buffer to the program yet! The end result is that the program has received everything except the last word. To avoid this problem, when the user stops the recorder the program should not stop immediately; instead it should wait until it has received one more buffer before stopping. In a worst case scenario there may be a few sounds after the end of the sentence that also get recorded, but that's better than missing data. One could reduce the amount of extra data that gets captured by reducing the size of the buffer.
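The difference between the two stopping strategies can be sketched with a small simulation that needs no phone at all. Everything here is hypothetical: the ChunkedRecordingDemo class, its Record method, and the idea that each buffer holds exactly one word are stand-ins for the real Microphone behavior, not part of the application's code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the microphone: it hands over one "word" per buffer.
public static class ChunkedRecordingDemo
{
    // Simulates a recording session. The user presses stop while the word at
    // index stopPressedDuringChunk is still sitting in the microphone's buffer.
    public static string Record(string[] words, int stopPressedDuringChunk, bool waitForFinalBuffer)
    {
        var received = new List<string>();
        bool stopRequested = false;

        for (int i = 0; i < words.Length; i++)
        {
            if (i == stopPressedDuringChunk)
            {
                stopRequested = true;
                if (!waitForFinalBuffer)
                {
                    // Buggy approach: stop right away, so chunk i is never delivered.
                    break;
                }
            }

            // The buffer is full and gets delivered to the program.
            received.Add(words[i]);

            if (stopRequested)
            {
                // Safer approach: stop only after one more buffer has arrived.
                break;
            }
        }

        return string.Join(" ", received.ToArray());
    }
}
```

Calling Record with waitForFinalBuffer set to false drops the word that was being spoken when stop was pressed, while passing true keeps it.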
Creating the code that does the above is fairly easy. To get an instance of the Microphone class we can just grab it from Microphone.Default. When the microphone is recording it will notify our program that a buffer is ready to be read by raising a BufferReady event. When this occurs we can grab the buffer data by calling GetData(byte[] buffer). For this method we must pass in a byte array that will receive the data. How big does this array need to be? The Microphone class has two other members that will help us identify the needed size. Microphone.BufferDuration tells us how much time the Microphone's buffer can hold, and the method Microphone.GetSampleSizeInBytes(TimeSpan) tells us how many bytes are needed for a recording of a specific length. Bringing the two together, the size of the array we need can be found with Microphone.GetSampleSizeInBytes(Microphone.BufferDuration). Once you have an instance of the Microphone class, have subscribed to the BufferReady event, and have created the array for receiving your data, the recording process can be started by calling Microphone.Start().
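To get a feel for the numbers involved, the size calculation can be approximated by hand. This is only a sketch: BufferSizeSketch and SampleSizeInBytes are hypothetical names, and the 16 kHz, 16-bit, mono figures used in the usage note are assumptions rather than values read from a device. The real GetSampleSizeInBytes may also apply alignment rules that this arithmetic ignores, so use the real method in the application.

```csharp
using System;

public static class BufferSizeSketch
{
    // Approximates how many bytes of uncompressed PCM audio fit in the given duration.
    public static int SampleSizeInBytes(TimeSpan duration, int sampleRate, int channels, int bitsPerSample)
    {
        // One frame holds one sample for every channel.
        int bytesPerFrame = channels * (bitsPerSample / 8);

        // Integer tick arithmetic avoids floating-point rounding surprises.
        long frames = duration.Ticks * sampleRate / TimeSpan.TicksPerSecond;

        return (int)(frames * bytesPerFrame);
    }
}
```

Under those assumptions a 100 ms buffer works out to 1,600 samples of 2 bytes each, so SampleSizeInBytes(TimeSpan.FromMilliseconds(100), 16000, 1, 16) gives 3,200 bytes.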
In the event handler for BufferReady there are a few things that need to be done. When the data is retrieved from the buffer it needs to be accumulated somewhere. After the data has been accumulated we need to check whether a request to stop recording has been made. If it has, we tell the Microphone instance to stop sending data by calling Microphone.Stop() and perform whatever actions are required to persist the recording. For accumulating the data I will use a memory stream and then write it to isolated storage when the recording is completed. One of my requirements was that audio data would be saved in WAV format. This requirement is satisfied by writing a proper wave header before I write out all the bytes that were received. Rather than expound on how to do that here I refer you to a previous blog post that I have written on the subject. The code I have for doing all of the above follows.
public void StartRecording()
{
    if (_currentMicrophone == null)
    {
        _currentMicrophone = Microphone.Default;
        _currentMicrophone.BufferReady += new EventHandler<EventArgs>(_currentMicrophone_BufferReady);
        // Size the receiving array to hold exactly one full microphone buffer.
        _audioBuffer = new byte[_currentMicrophone.GetSampleSizeInBytes(_currentMicrophone.BufferDuration)];
        _sampleRate = _currentMicrophone.SampleRate;
    }
    _stopRequested = false;
    _currentRecordingStream = new MemoryStream(1048576);
    _currentMicrophone.Start();
}

public void RequestStopRecording()
{
    // Only set a flag; recording actually stops after the next buffer arrives.
    _stopRequested = true;
}

void _currentMicrophone_BufferReady(object sender, EventArgs e)
{
    // GetData returns the number of bytes it copied, so write only that many.
    int bytesRead = _currentMicrophone.GetData(_audioBuffer);
    _currentRecordingStream.Write(_audioBuffer, 0, bytesRead);
    if (!_stopRequested)
        return;
    _currentMicrophone.Stop();
    var isoStore = System.IO.IsolatedStorage.IsolatedStorageFile.GetUserStoreForApplication();
    using (var targetFile = isoStore.CreateFile(FileName))
    {
        WaveHeaderWriter.WriteHeader(targetFile, (int)_currentRecordingStream.Length, 1, _sampleRate);
        // GetBuffer exposes the whole backing array, so write only the recorded bytes.
        var dataBuffer = _currentRecordingStream.GetBuffer();
        targetFile.Write(dataBuffer, 0, (int)_currentRecordingStream.Length);
        targetFile.Flush();
    }
}
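The header writer itself is covered in the earlier blog post and is not reproduced in this article. For readers who want something to start from, here is a minimal sketch of what a compatible writer could look like. WaveHeaderSketch is a hypothetical name, and the code assumes uncompressed 16-bit PCM and the canonical 44-byte header layout; it is not the author's WaveHeaderWriter.

```csharp
using System;
using System.IO;
using System.Text;

public static class WaveHeaderSketch
{
    // Writes a canonical 44-byte RIFF/WAVE header for uncompressed PCM audio.
    // dataLength is the number of audio bytes that will follow the header.
    public static void WriteHeader(Stream target, int dataLength, int channelCount, int sampleRate)
    {
        const short bitsPerSample = 16; // assumption: 16-bit samples
        short blockAlign = (short)(channelCount * bitsPerSample / 8);
        int byteRate = sampleRate * blockAlign;

        // BinaryWriter always writes little-endian values, which is what WAV expects.
        // The writer is deliberately not disposed so the caller's stream stays open.
        var writer = new BinaryWriter(target);
        writer.Write(Encoding.UTF8.GetBytes("RIFF"));
        writer.Write(36 + dataLength);            // bytes in the file after this field
        writer.Write(Encoding.UTF8.GetBytes("WAVE"));
        writer.Write(Encoding.UTF8.GetBytes("fmt "));
        writer.Write(16);                         // size of the PCM format chunk
        writer.Write((short)1);                   // format tag 1 = PCM
        writer.Write((short)channelCount);        // lands at byte offset 22
        writer.Write(sampleRate);                 // lands at byte offset 24
        writer.Write(byteRate);
        writer.Write(blockAlign);
        writer.Write(bitsPerSample);
        writer.Write(Encoding.UTF8.GetBytes("data"));
        writer.Write(dataLength);                 // audio data starts at offset 44
        writer.Flush();
    }
}
```

With this layout the channel count ends up at offset 22, the sample rate at offset 24, and the audio data at offset 44.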
Audio Playback
For playing back audio I will make use of the SoundEffect class. Like the Microphone class, SoundEffect is an XNA audio class and requires the FrameworkDispatcher.Update() method to be called periodically. There are two ways I could go about loading the WAVE file. I could either decode the header myself or let the SoundEffect class do it. I show manual decoding here for reference should someone need to make other modifications to the file. When instantiating a SoundEffect through its constructor three items of data are needed: the recorded audio data, the sample rate, and the number of audio channels in the recording. This application will only be recording in mono, not stereo, so there will always be one audio channel. I could get away with passing AudioChannels.Mono for this field. But in the future I may add the ability to import recordings (which could be in stereo), so I'm going to pull this data from the wave header. Likewise I could have grabbed the sample rate from the Microphone class instead of obtaining it from the wave header, but in the interest of things I'm considering for the future I will also obtain it from the header. The wave data itself is everything after the header. Once a SoundEffect is initialized, to play it I must get a SoundEffectInstance from it and then call its Play method.
I don't think that I need to explain why I only want one recording to play at a time. So before playing a new audio clip I check to see if there is an existing one loaded in memory and I stop it.
public void PlayRecording(RecordingDetails source)
{
    if (_currentSound != null)
    {
        _currentSound.Stop();
        _currentSound = null;
    }
    var isoStore = System.IO.IsolatedStorage.IsolatedStorageFile.GetUserStoreForApplication();
    if (isoStore.FileExists(source.FilePath))
    {
        byte[] fileContents;
        using (var fileStream = isoStore.OpenFile(source.FilePath, FileMode.Open))
        {
            fileContents = new byte[(int)fileStream.Length];
            fileStream.Read(fileContents, 0, fileContents.Length);
            fileStream.Close(); // not really needed, but it makes me feel better.
        }
        // The sample rate is a little-endian 32-bit value at byte offset 24 of the header.
        int sampleRate = ((fileContents[24] << 0) | (fileContents[25] << 8) |
                          (fileContents[26] << 16) | (fileContents[27] << 24));
        // The channel count is stored at byte offset 22.
        AudioChannels channels = (fileContents[22] == 1) ? AudioChannels.Mono : AudioChannels.Stereo;
        // The audio data starts after the 44-byte header that this application writes.
        var se = new SoundEffect(fileContents, 44, fileContents.Length - 44, sampleRate, channels, 0, 0);
        _currentSound = se.CreateInstance();
        _currentSound.Play();
    }
}
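The fixed offsets 22, 24, and 44 hold for files written by this application, but a WAV file produced by another tool can carry extra chunks ahead of the audio data. If imported recordings are ever supported, a slightly more defensive header reader might look like the sketch below. WaveInfo and WaveHeaderReader are hypothetical names, and the code assumes a little-endian platform because it relies on BitConverter.

```csharp
using System;

// Hypothetical container for the header fields that playback needs.
public sealed class WaveInfo
{
    public int Channels;
    public int SampleRate;
    public int DataOffset;
    public int DataLength;
}

public static class WaveHeaderReader
{
    // Walks the RIFF chunks instead of assuming the audio starts at byte 44.
    public static WaveInfo Parse(byte[] file)
    {
        if (file == null || file.Length < 12 || Tag(file, 0) != "RIFF" || Tag(file, 8) != "WAVE")
            throw new ArgumentException("Not a RIFF/WAVE file.");

        var info = new WaveInfo();
        bool foundFormat = false;
        int pos = 12; // the first chunk starts right after "RIFF", the size field, and "WAVE"

        while (pos + 8 <= file.Length)
        {
            string id = Tag(file, pos);
            int size = BitConverter.ToInt32(file, pos + 4);
            int body = pos + 8;
            if (size < 0 || size > file.Length - body)
                size = file.Length - body; // tolerate a truncated or bogus size field

            if (id == "fmt " && size >= 16)
            {
                info.Channels = BitConverter.ToInt16(file, body + 2);
                info.SampleRate = BitConverter.ToInt32(file, body + 4);
                foundFormat = true;
            }
            else if (id == "data" && foundFormat)
            {
                info.DataOffset = body;
                info.DataLength = size;
                return info;
            }

            pos = body + size + (size & 1); // chunks are padded to an even length
        }

        throw new ArgumentException("No fmt and data chunks found.");
    }

    // Reads a four-character chunk identifier.
    private static string Tag(byte[] file, int offset)
    {
        return new string(new[]
        {
            (char)file[offset], (char)file[offset + 1],
            (char)file[offset + 2], (char)file[offset + 3]
        });
    }
}
```

The resulting DataOffset and DataLength could then take the place of the hard-coded 44 and fileContents.Length - 44 when constructing the SoundEffect.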
Loading the sound via SoundEffect.FromStream is simple and straightforward.
public void PlayRecording(RecordingDetails source)
{
    SoundEffect se;
    if (_currentSound != null)
    {
        _currentSound.Stop();
        _currentSound = null;
    }
    var isoStore = System.IO.IsolatedStorage.IsolatedStorageFile.GetUserStoreForApplication();
    if (isoStore.FileExists(source.FilePath))
    {
        using (var fileStream = isoStore.OpenFile(source.FilePath, FileMode.Open))
        {
            // FromStream reads the wave header itself, so no manual decoding is needed.
            se = SoundEffect.FromStream(fileStream);
            fileStream.Close(); // not really needed, but it makes me feel better.
        }
        _currentSound = se.CreateInstance();
        _currentSound.Play();
    }
}
Keeping Track of the Recordings
In addition to keeping the recordings in isolated storage I wanted to keep track of some other things such as the date the recording was made, a title for the recording, and notes for the recording. It is possible to give the recording a title through the file name or to infer a recorded date from the date on the file, but that solution just doesn't seem durable. There are constraints on the characters that can appear in a file name, and in the future when I add the ability to import and export files the file dates could be lost. Instead I've made a class that will hold all the information I want to track on a recording. A simplified view of the class follows.
public class RecordingDetails
{
public string Title { get; set; }
public string Details { get; set; }
public DateTime TimeStamp { get; set; }
public string FilePath { get; set; }
public string SourceFileName { get; set; }
}
I gave a simplified view in the interest of keeping the class easy to read. This class needs to be serializable so that I can read it from and write it to isolated storage, so the class is decorated with the [DataContract] attribute and the properties are decorated with the [DataMember] attribute. I also plan to bind instances of this class to UI elements, so the class needs to implement the INotifyPropertyChanged interface. The full version of the class follows. It isn't as much typing as it looks; I use Visual Studio snippets to automate generation of part of the code.
[DataContract]
public class RecordingDetails : INotifyPropertyChanged
{
    // Title - generated from ObservableField snippet - Joel Ivory Johnson
    private string _title;
    [DataMember]
    public string Title
    {
        get { return _title; }
        set
        {
            if (_title != value)
            {
                _title = value;
                OnPropertyChanged("Title");
            }
        }
    }
    //-----

    // Details - generated from ObservableField snippet - Joel Ivory Johnson
    private string _details;
    [DataMember]
    public string Details
    {
        get { return _details; }
        set
        {
            if (_details != value)
            {
                _details = value;
                OnPropertyChanged("Details");
            }
        }
    }
    //-----

    // FilePath - generated from ObservableField snippet - Joel Ivory Johnson
    private string _filePath;
    [DataMember]
    public string FilePath
    {
        get { return _filePath; }
        set
        {
            if (_filePath != value)
            {
                _filePath = value;
                OnPropertyChanged("FilePath");
            }
        }
    }
    //-----

    // TimeStamp - generated from ObservableField snippet - Joel Ivory Johnson
    private DateTime _timeStamp;
    [DataMember]
    public DateTime TimeStamp
    {
        get { return _timeStamp; }
        set
        {
            if (_timeStamp != value)
            {
                _timeStamp = value;
                OnPropertyChanged("TimeStamp");
            }
        }
    }
    //-----

    // SourceFileName - generated from ObservableField snippet - Joel Ivory Johnson
    private string _sourceFileName;
    [IgnoreDataMember]
    public string SourceFileName
    {
        get { return _sourceFileName; }
        set
        {
            if (_sourceFileName != value)
            {
                _sourceFileName = value;
                OnPropertyChanged("SourceFileName");
            }
        }
    }
    //-----

    // IsNew - generated from ObservableField snippet - Joel Ivory Johnson
    private bool _isNew = false;
    [IgnoreDataMember]
    public bool IsNew
    {
        get { return _isNew; }
        set
        {
            if (_isNew != value)
            {
                _isNew = value;
                OnPropertyChanged("IsNew");
            }
        }
    }
    //-----

    // IsDirty - generated from ObservableField snippet - Joel Ivory Johnson
    private bool _isDirty = false;
    [IgnoreDataMember]
    public bool IsDirty
    {
        get { return _isDirty; }
        set
        {
            if (_isDirty != value)
            {
                _isDirty = value;
                OnPropertyChanged("IsDirty");
            }
        }
    }
    //-----

    public void Copy(RecordingDetails source)
    {
        this.Details = source.Details;
        this.FilePath = source.FilePath;
        this.SourceFileName = source.SourceFileName;
        this.TimeStamp = source.TimeStamp;
        this.Title = source.Title;
    }

    public event PropertyChangedEventHandler PropertyChanged;

    protected void OnPropertyChanged(string propertyName)
    {
        if (PropertyChanged != null)
        {
            PropertyChanged(this, new PropertyChangedEventArgs(propertyName));
        }
    }
}
The [DataMember] attribute spread throughout the code is there so that I can use data contract serialization to read and write this class. Since I'm using the DataContractSerializer I don't have to concern myself much with the specifics of how this file will be encoded when it is saved and loaded. While using isolated storage isn't hard, I'm using a variant of a utility class from a previous blog entry to simplify serialization and deserialization to a few lines of code. When the user creates a new recording a new instance of this class is also created. In addition to the title, notes, and time stamp, this class also contains a path to the recording that it describes and a nonserialized member, SourceFileName, that contains the name of the original file from which this data was loaded. Without that information, if the user decides to update the data there is no way to know which file should be overwritten when the content is saved.
//Saving Data
var myDataSaver = new DataSaver<RecordingDetails>() {};
myDataSaver.SaveMyData(LastSelectedRecording, LastSelectedRecording.SourceFileName);
//Loading Data
var myDataSaver = new DataSaver<RecordingDetails>();
var item = myDataSaver.LoadMyData(LastSelectedRecording.SourceFileName);
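The DataSaver class comes from the earlier blog entry and isn't listed here. As a rough idea of what such a helper does underneath, the following sketch round-trips an object through DataContractSerializer using plain streams. StreamDataSaver and MemoDetails are hypothetical names made up for this illustration; the real utility class works against isolated storage and may be structured differently.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical, trimmed-down stand-in for RecordingDetails.
[DataContract]
public class MemoDetails
{
    [DataMember]
    public string Title { get; set; }

    [DataMember]
    public string Details { get; set; }

    // Not written to the stream, mirroring SourceFileName in the article.
    [IgnoreDataMember]
    public string SourceFileName { get; set; }
}

public static class StreamDataSaver
{
    // Serializes the item as XML into the target stream.
    public static void Save<T>(T item, Stream target)
    {
        var serializer = new DataContractSerializer(typeof(T));
        serializer.WriteObject(target, item);
    }

    // Reads an item of type T back from the source stream.
    public static T Load<T>(Stream source)
    {
        var serializer = new DataContractSerializer(typeof(T));
        return (T)serializer.ReadObject(source);
    }
}
```

On the phone the stream would come from isolated storage, for example from the CreateFile and OpenFile calls used elsewhere in this article. After loading, SourceFileName is empty, which is why the loading code sets it again from the file name.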
With that you have all the information that's needed to perform recording, save the recordings, and load the recordings. When the program first starts I have it load all of the saved RecordingDetails and add them to an ObservableCollection on my view model. From there they can be bound to a list displayed to the user.
public void LoadData()
{
    var isoStore = System.IO.IsolatedStorage.IsolatedStorageFile.GetUserStoreForApplication();
    var recordingList = isoStore.GetFileNames("data/*.xml");
    var myDataSaver = new DataSaver<RecordingDetails>();
    Items.Clear();
    foreach (var desc in recordingList.Select(item =>
        {
            var result = myDataSaver.LoadMyData(String.Format("data/{0}", item));
            result.SourceFileName = String.Format("data/{0}", item);
            return result;
        }))
    {
        Items.Add(desc);
    }
    this.IsDataLoaded = true;
}
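One of the features in the reduced set is ordering recordings by date or by name. The article doesn't show how the application does this, so the following is only one possible approach using LINQ. MemoEntry and MemoSorting are hypothetical names, and the choice of newest-first ordering and case-insensitive ordinal comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down stand-in for RecordingDetails.
public class MemoEntry
{
    public string Title { get; set; }
    public DateTime TimeStamp { get; set; }
}

public static class MemoSorting
{
    // Most recent recordings first.
    public static List<MemoEntry> ByNewestFirst(IEnumerable<MemoEntry> items)
    {
        return items.OrderByDescending(m => m.TimeStamp).ToList();
    }

    // Alphabetical by title, ignoring case.
    public static List<MemoEntry> ByTitle(IEnumerable<MemoEntry> items)
    {
        return items.OrderBy(m => m.Title, StringComparer.OrdinalIgnoreCase).ToList();
    }
}
```

The sorted list could then be used to repopulate the collection that the UI is bound to.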
Saving State and Tombstoning
Your program can be interrupted at any time by something like an incoming call or the user breaking away to do a search or some other action. When this happens your application will get tombstoned; the OS will save which page the user was on and will give your program a chance to save other data. When the program is reloaded the developer must ensure that steps are taken to properly restore state. For the most part I didn't need to worry about tombstoning because most of the state data for the program is promptly persisted to isolated storage. There's also not much state data to be saved; recordings are immediately committed along with changes to the program's settings. If you want to learn more about tombstoning, I strongly suggest looking to a resource other than this article.
So You Can Record and Playback. Now What?
There are plenty of memo recorders in the Marketplace. What is the purpose of making another? There are other sound-related applications that can make use of the functionality implemented within this code. A voice memo recorder is not my end goal. My end goal actually isn't singular; there are a lot of applications that can be derived from this code. Right now the source code phylogeny that I expect to result from this application is below.
Preparing For Certification
Certification can take anywhere from a few hours to a few days. The minimal set of files you need when preparing an application for certification is the XAP containing your application (remember to do a release build!), at least one screenshot, and a few image icons in various sizes (200x200, 173x173, and 99x99 pixels). I won't cover the certification process here but will detail it in a later article. While you are waiting for certification you may want to pass the time by preparing a promotional page on your website. There's a standard set of images for referring someone to the Marketplace. You can grab the images from here and they come in various sizes, colors, and languages.
After your application passes certification you'll be able to see the direct link to your app. In the case of this application it is http://social.zune.net/redirect?type=phoneApp&id=268c6119-d755-e011-854c-00237de2db9e. Combined with the image I've got a recognizable download link that I could put on a promotional page.

What's Next?
I've put this code out as an example only. From here I'll improve upon my own version of this application and I probably won't be updating the version in this article beyond minor bug fixes.
History
- 2010 March 31 - Initial Publication