How you can use Words.Net from Aspose to manipulate MS Word documents in C#

Words.Net is a utility which allows you to build and edit Microsoft Word documents in C# rather than VBA. Should you adopt it? Read this review to find out!
Words.Net from Aspose allows you to read and manipulate the underlying content of a Microsoft Word document using C#, rather than VBA.
It's also staggeringly expensive, costing a minimum of $1,199 for a year's licence (worth bearing in mind before you read on).
We write all our courseware in Word, and have done for many a long year - here's a sample page:
A sample page from a Wise Owl courseware manual.
We now want to make our manuals available online as a searchable PDF. So why not just write a VBA macro to save our Word documents in PDF form?
Problem | Notes |
---|---|
Fidelity | Although we did write a VBA program to combine the chapters in a manual into a single PDF, we found that it contained anomalies (pages didn't always break in the right place, and Wise Owl hints had a habit of jumping to the top of the page). |
Numbering | It proved difficult to get both page numbering and chapter numbering to work correctly. |
Fiddliness of Word | Although I have begun to appreciate Word more over the years - it must have been a remarkably hard program to write - it is extraordinarily hard to understand the intricacies of the Word document object model, and even harder to then manipulate this in VBA. |
Something else was needed - and generating our manuals in C# allows us to compile and print documents through our Intranet.
I don't claim to have done an exhaustive survey of all of the tools available, but the research I did suggested that Words.Net would be the best tool to solve our problem (and so it has proved). You can install Words.Net in Visual Studio through NuGet (the package is called Aspose.Words):
I also tried playing about with Aspose.PDF: great for redacting documents, for example, but also eye-wateringly expensive!
To establish that you have a licence to use Words.Net, you then download the licence files and put them in a folder:
I started using a 30-day evaluation licence, to check Words.Net would solve our problem.
You can then run a couple of lines of C# code to set your licence:
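```csharp
// create a licence to use Words.Net
var licenceWord = new Aspose.Words.License();

// build the path to the licence file (webPath is the root folder of the site)
licenceWord.SetLicense(System.IO.Path.Combine(
    webPath, "Aspose", "Aspose.Words.NET.lic"));
```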
To give you a feel for how Words.Net works, here's some sample code we wrote to do some typical things. Opening files is straightforward:
```csharp
// open this file
var manualChapterWord = new Aspose.Words.Document(
    wordPath + FilePath + "//" + ChapterName);
```
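Since the aim is to end up with a PDF, here is a minimal sketch of opening a Word file and saving a PDF copy. The class name, method name and both path parameters are hypothetical, made up for this illustration; it is not the code used to build the Wise Owl manuals.

```csharp
using Aspose.Words;

public static class ChapterToPdf
{
    // Opens a Word file and writes a PDF version of it.
    // Both paths are supplied by the caller (hypothetical parameters).
    public static void Convert(string wordFilePath, string pdfFilePath)
    {
        // load the Word document from disk
        var chapter = new Document(wordFilePath);

        // save it in PDF format
        chapter.Save(pdfFilePath, SaveFormat.Pdf);
    }
}
```

You might call this as `ChapterToPdf.Convert(@"C:\manuals\chapter1.docx", @"C:\manuals\chapter1.pdf")` (the paths here are invented).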
Any Word document consists of a collection of paragraphs (among other things), and each paragraph consists of a collection of runs of text. So the following sample of code loops over all of the paragraphs in a document, and gets the collection of runs of text for each:
```csharp
// loop over all of the paragraphs in the document, setting chapter and section numbers
var paragraphs = manualChapterWord.GetChildNodes(
    NodeType.Paragraph, true);

foreach (Aspose.Words.Paragraph paragraph in paragraphs)
{
    // a single paragraph can contain lots of individual runs: avoid this
    paragraph.JoinRunsWithSameFormatting();

    // get all the bits of text in this paragraph
    NodeCollection runs = paragraph.GetChildNodes(NodeType.Run, true);

    // (processing of each run is omitted here)
}
```
The JoinRunsWithSameFormatting method should be unnecessary, but isn't (I discovered that a Word paragraph could contain no formatting marks whatsoever, but still consist of two or more separate runs of text).
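To see the effect of joining runs for yourself, something like the following sketch could help. It assumes a document has already been opened, and the `RunLister` class and `ListRuns` method are hypothetical names; it simply reports how many joins were made in each paragraph and prints the text of each remaining run.

```csharp
using System;
using Aspose.Words;

public static class RunLister
{
    // Prints the runs of every paragraph after merging runs with identical formatting.
    public static void ListRuns(Document doc)
    {
        NodeCollection paragraphs = doc.GetChildNodes(NodeType.Paragraph, true);

        foreach (Paragraph paragraph in paragraphs)
        {
            // the method returns the number of joins it performed
            int joins = paragraph.JoinRunsWithSameFormatting();
            Console.WriteLine($"Paragraph: {joins} join(s) made");

            // Runs holds only the runs which are immediate children of the paragraph
            foreach (Run run in paragraph.Runs)
            {
                Console.WriteLine($"  [{run.Text}]");
            }
        }
    }
}
```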
Changing text is relatively easy - this command would replace some text in a paragraph:
```csharp
// replace old with new in the document
paragraph.Range.Replace(oldText, newText);
```
Adding text is more difficult, I found!
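For what it's worth, the simplest case of adding text (tacking a new run onto the end of an existing paragraph) can be sketched as below. This is one possible approach rather than the one used for the Wise Owl manuals, and the `TextAppender` class and `AppendText` method are hypothetical names.

```csharp
using Aspose.Words;

public static class TextAppender
{
    // Adds a new run of text to the end of an existing paragraph
    // and returns it, so that the caller can format it if need be.
    public static Run AppendText(Paragraph paragraph, string text)
    {
        // a run has to be created for the document which will own it
        var run = new Run(paragraph.Document, text);

        // add it as the last child of the paragraph
        paragraph.AppendChild(run);
        return run;
    }
}
```

The new run has no direct formatting of its own, so you may need to set properties of `run.Font` afterwards to match the surrounding text.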
I encountered two serious problems, each of which I eventually managed to solve. I'm sharing the solutions here in case they help anyone else!
It's worth stressing that each of these problems is due to the complexity of the Word object model, rather than Words.Net (which tries to make it as easy as possible to play about with the contents of a Word document).
The first problem was that Word paragraphs are sometimes not what you think they are:
In this example the title Choosing an Environment is part of the same paragraph as the contents of the callout shown selected!
The only way I could find to get round this was to loop over all of the runs in a paragraph, choosing which one to process.
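One way of choosing which runs to process is sketched below. It assumes that the callout text lives in a nested paragraph (for example inside a text box anchored in the outer paragraph), which may not match every document; the `RunFilter` class and `OwnRuns` method are hypothetical names.

```csharp
using System.Collections.Generic;
using Aspose.Words;

public static class RunFilter
{
    // Returns the runs which belong to this paragraph itself, leaving out any runs
    // whose nearest paragraph ancestor is a different (nested) paragraph.
    public static List<Run> OwnRuns(Paragraph paragraph)
    {
        var ownRuns = new List<Run>();

        // true = look at all descendants, not just immediate children
        foreach (Run run in paragraph.GetChildNodes(NodeType.Run, true))
        {
            // for a run nested in a text box, the nearest paragraph is the one in the box
            if (ReferenceEquals(run.GetAncestor(NodeType.Paragraph), paragraph))
            {
                ownRuns.Add(run);
            }
        }

        return ownRuns;
    }
}
```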
The second problem - and the one which cost me most time - was this innocuous-looking pair of commands to get the page number for a paragraph:
```csharp
// get the page number of this paragraph
LayoutCollector layout = new LayoutCollector(manualChapterWord);
currentPageNumber = layout.GetStartPageIndex(paragraph);
```
It turns out that the act of getting information on the layout of a document wipes out any programmatic changes you've made to it! I couldn't find this documented anywhere, but ended up deducing it. You have been warned!
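If the behaviour described above holds for your document, one defensive pattern is to read all of the page numbers first and only then start making changes. The sketch below assumes this two-pass approach suits your task; the `PageNumbersFirst` class, the `StampPageNumbers` method and the `{page}` placeholder text are all hypothetical.

```csharp
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Layout;

public static class PageNumbersFirst
{
    // Pass 1 records the starting page of every paragraph;
    // pass 2 edits the document using the recorded numbers.
    public static void StampPageNumbers(Document doc)
    {
        // pass 1: collect layout information before changing anything
        var collector = new LayoutCollector(doc);
        var pages = new List<(Paragraph Paragraph, int Page)>();

        foreach (Paragraph paragraph in doc.GetChildNodes(NodeType.Paragraph, true))
        {
            pages.Add((paragraph, collector.GetStartPageIndex(paragraph)));
        }

        // pass 2: make the changes, without asking for any more layout information
        foreach (var (paragraph, page) in pages)
        {
            // hypothetical edit: swap a placeholder for the page number
            paragraph.Range.Replace("{page}", page.ToString());
        }
    }
}
```

The page numbers recorded in the first pass will of course go stale if the edits in the second pass change where pages break.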
My experience was that when I searched for help on a problem, the Aspose Words.Net forum would usually contain a helpful discussion about it. Typically this would follow one of two formats. Either:
Question | Aspose answer |
---|---|
I can't do X - can you help me? | Please can you post your document so we can. |
Here's the document. | You can do this in Words.Net: here's some sample code to do it. |
Or less helpfully:
Question | Aspose answer |
---|---|
Can you do Y in Words.Net? | That feature isn't supported - we have reported it, and will try to incorporate it in a future release. |
(1 year later) Have you made any progress on this? | We haven't had time to build this feature into the software yet, but are intending to in the next release. |
(Another year passes) Just wondering if you've made any progress? | We're pleased to say that this feature is now in the new release of Words.Net. |
In general, I've found that I can get answers to my questions by diligently searching through the Aspose Words.Net forum, but anecdotal discussion on forums like Reddit suggests that you can wait a long time for answers.
Be warned, however, that suggested solutions don't take any prisoners: you'll need to know how to program in C# to understand proposed answers!
Would I recommend Words.Net? Yes, if:
For most people, I suspect, one of the many free (but probably inferior) open source tools will be a better choice.
© Wise Owl Business Solutions Ltd 2024. All Rights Reserved.