Home and Learn: Intermediate Programming
Text to Speech: For C# and VB NET Students
We've opened text files and PDF files in previous lessons, and had our Speech Program read them out. We'll open up a Word file in this lesson, grab the text, and have our Synthesizer read it out.
To open up a Word file, you need to add a reference to the Microsoft Word Object Library. We did something similar for the Excel Charts project.
Right-click on References again in the Solution Explorer and select Add Reference. From the Reference Manager dialog box, click on COM on the left. Scroll down and locate the Microsoft Word Object Library. Check the box for the latest version you have, which is 16.0 in the image below:
Click OK to add the reference. You should see items called Microsoft.Office.Core and Microsoft.Office.Interop.Word appear in your list of references:
In your coding window, for C# coders, add this using statement to the top of your code:
using Microsoft.Office.Interop.Word;
VB Net coders add a new Imports line at the top. Add this:
Imports Microsoft.Office.Interop.Word
We can open up Word now and read the text of a document.
Add a new Sub/method to your code. In C# add this method:
private void GetWorDFile(string filePath)
{
}
And add this Sub in VB Net:
Private Sub GetWorDFile(filePath As String)
End Sub
As the first line of code for this new Sub/method, clear the text box with this line (delete the semicolon on the end in VB):
txtSpeechText.Text = "";
For the next line, we need to create a new Word object. In C#, you need to add this very long line:
Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.Application();
In VB Net, however, you can just add this shorter version of the C# line:
Dim word As Application = New Application()
The reason you can't add a shorter line in C# is a clash of namespaces. If you tried to add this:
Application word = new Application();
You'd get a red underline under Application. The error occurs because there's a clash between Word.Application and Windows.Forms.Application: both namespaces are in scope, so C# doesn't know which one it's supposed to use. Seems strange that this is not an error in VB!
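If the very long line bothers you, one common way around the clash in C# is a namespace alias. The following is only a sketch, not part of the original project: the alias name Word, the class name AliasDemoForm and the method name CreateWord are our own choices, and it assumes the Microsoft Word Object Library reference has already been added.

```csharp
using System.Runtime.InteropServices;
using System.Windows.Forms;
// Alias the Word namespace instead of importing every type from it.
// "Word" is just a name we picked for this sketch.
using Word = Microsoft.Office.Interop.Word;

// Hypothetical form, used only to show the alias in context.
public class AliasDemoForm : Form
{
    private void CreateWord()
    {
        // Word.Application can only mean the Word interop type
        Word.Application word = new Word.Application();

        // The unqualified name now refers to Windows Forms only
        string exePath = Application.ExecutablePath;

        // Clean up, as the lesson does later on
        word.Quit();
        Marshal.ReleaseComObject(word);
    }
}
```

If you went this route and removed the using Microsoft.Office.Interop.Word line, other Word types would need the alias too, for example Word.Document instead of Document. The rest of this lesson sticks with the long form.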
We need to create a new document object now. This object will be created from the Word document that you open from the file dialog box. In C#, add this line:
Document doc = word.Documents.Open(filePath);
And this one in VB Net:
Dim doc As Document = word.Documents.Open(filePath)
At this stage, you could just dump the entire file into the text box with this line:
txtSpeechText.Text = doc.Content.Text;
There is a problem with this, however. It does work, but examine the image below:
The gaps in the text are where the paragraphs are. It's kept all of the indents.
If you want to delete all of the indents then you can count the paragraphs in the document, then loop round and trim the text.
To do this, add this line in C#:
int paras = doc.Paragraphs.Count;
And this one in VB Net:
Dim paras As Integer = doc.Paragraphs.Count
We can use the paras variable to loop round and grab each paragraph. Add this loop to your code in C#:
for (int i = 1; i <= paras; i++)
{
}
And this in VB Net:
For i = 1 To paras
Next
Here's the first line of code to place in your loop:
C#:
string temp = doc.Paragraphs[i].Range.Text.Trim();
VB Net:
Dim temp As String = doc.Paragraphs(i).Range.Text.Trim()
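The Paragraphs part after the equal sign is a property that holds the collection of paragraphs in the document (its Count property is what gave us the number of paragraphs). You can use your loop counter on this collection much as you would on an array. The loop counter, i, starts at 1 because Word starts counting Paragraphs at 1. You then need Range.Text to get the text of the paragraph, and Trim() to strip the white space from both ends and so get rid of the indents.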
The final line of the loop to add puts the newly trimmed paragraph into the text box:
C#
txtSpeechText.Text += temp + "\r\n";
VB
txtSpeechText.Text += temp + vbNewLine
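If you'd like to see what the Trim() call and the added line break do without launching Word, here is a small stand-alone sketch. The class name TrimDemo, the method name CleanParagraphs and the sample strings are made up for illustration. The sample strings stand in for what Range.Text hands back, on the assumption that each paragraph's text ends with a carriage return for the paragraph mark.

```csharp
using System;

public static class TrimDemo
{
    // Mimics the loop body from the lesson: trim each paragraph,
    // then append it followed by a Windows line break.
    public static string CleanParagraphs(string[] rawParagraphs)
    {
        string result = "";
        foreach (string raw in rawParagraphs)
        {
            // Trim() removes white space (tabs, spaces, \r) from both ends
            string temp = raw.Trim();
            result += temp + "\r\n";
        }
        return result;
    }

    public static void Main()
    {
        // Hypothetical sample data standing in for doc.Paragraphs[i].Range.Text
        string[] raw = { "\tFirst paragraph.\r", "   Second paragraph.\r" };
        Console.Write(CleanParagraphs(raw));
    }
}
```

Each paragraph comes out on its own line with no leading tab or spaces, which is the effect we're after in the text box.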
If you did the section on Excel charts you'll know that COM objects like Word need to be cleaned up properly. Add the following using statement to the top of your code in C#:
using System.Runtime.InteropServices;
And this Imports statement to the top of your code in VB Net:
Imports System.Runtime.InteropServices
Back in your GetWorDFile Sub/method, and just after the loop, add this cleanup code in C#:
if (doc != null)
{
    doc.Close();
    Marshal.ReleaseComObject(doc);
}
if (word != null)
{
    word.Quit();
    Marshal.ReleaseComObject(word);
}
And this in VB Net:
If doc IsNot Nothing Then
    doc.Close()
    Marshal.ReleaseComObject(doc)
End If
If word IsNot Nothing Then
    word.Quit()
    Marshal.ReleaseComObject(word)
End If
Now add the calling line to the if statement of your Open File button. When you're done, your code should look like this in C#:
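Pieced together from the steps in this lesson, the C# method would look roughly like the sketch below. The form name Form1 is an assumption, so use whatever your own form is called. The txtSpeechText text box comes from the earlier lessons and is expected to live in the designer half of the partial class. The Open File button code is not repeated here.

```csharp
using System.Runtime.InteropServices;
using System.Windows.Forms;
using Microsoft.Office.Interop.Word;

// Form1 is an assumed name; the other half of this partial class
// (the designer file) is where txtSpeechText is declared.
public partial class Form1 : Form
{
    private void GetWorDFile(string filePath)
    {
        // Start with an empty text box
        txtSpeechText.Text = "";

        // Fully qualified to avoid the clash with Windows.Forms.Application
        Microsoft.Office.Interop.Word.Application word = new Microsoft.Office.Interop.Word.Application();
        Document doc = word.Documents.Open(filePath);

        // Word counts paragraphs from 1, not 0
        int paras = doc.Paragraphs.Count;
        for (int i = 1; i <= paras; i++)
        {
            string temp = doc.Paragraphs[i].Range.Text.Trim();
            txtSpeechText.Text += temp + "\r\n";
        }

        // Close the document and Word, then release the COM objects
        if (doc != null)
        {
            doc.Close();
            Marshal.ReleaseComObject(doc);
        }
        if (word != null)
        {
            word.Quit();
            Marshal.ReleaseComObject(word);
        }
    }
}
```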
And this in VB Net:
If you were to run the code now, the text in the text box would look like this:
All those indents are gone!
Have a go for yourself, though. Open up a Word file on your computer. Select a voice from the dropdown list and click your Speak button. Your Word file will be read out to you.
The next thing we'll do is get the Pronounce Highlighted Word button working.