Breaking down a large Word document for sharing

Started by Alexander Deliyannis on 12/1/2009
Alexander Deliyannis 12/1/2009 7:11 pm
The concept of what I need to do seems simple enough, yet for the life of me I am unable to think of a tool to do it. I am surprised myself that I haven't needed to do something like this before.

Here's the deal. I have a 150+ page document describing a large number of project tasks my team are working on. The document has become too cumbersome to move around, especially since several people may need to update their specific tasks concurrently.

So my idea is to break the document down to small units, each describing a single task. The question is: is there some tool to do this automatically?

Most tasks take up a single page, but some take more. I wouldn't mind doing the long ones manually. I also wouldn't mind if my imaginary tool (for example, a Word macro) needed some kind of separator to mark the start of each new section.

But with all the time and money I have invested in information managers, I hate to think I will have to do the whole process by hand. Also, I am quite certain I will want to do this again in the future, so I would happily invest in a tool that facilitates such tasks; then there's CRIMP as an additional incentive.


Stephen Zeoli 12/1/2009 7:51 pm
Hi, Alexander,

If I am correctly interpreting what you are looking for, I think you will find that Zoot can do this just fine. It will work especially well if each of the new extracted documents already starts with the same text. For instance, if the start of each task has the text "Task: XXXXX" where "XXXXX" is the variable. If this is the case, just use the file import wizard in Zoot (accessible via the import option of the file menu). After selecting the file you want to import (and it works with .doc files), select "Import delimited file (other)" and set Task: as the extract delimiter. Then pick the database into which you want to import the files. Then determine whether you want to use an existing folder or create a new one to hold the imported items. After you click "Finish", a dialog box appears that allows you to set the extract delimiter (in this case Task:) and a subject delimiter, which can be the same delimiter or another. Then click OK. It's like magic! And, if you've got fields that end in a colon, you can now create columns for viewing your information in a table. It's quite brilliant, really.

If you don't have a common delimiter at the front of each of the sections you want to extract, you'll have to go in and add something. Any unusual text string will do... a series of periods or xxxx's, whatever.
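For anyone who would rather script the same delimiter idea, here is a minimal Python sketch. It is not how Zoot does it internally; it only illustrates the principle on plain text, so formatting, tables and images are lost. The function names `split_on_delimiter` and `subject_of` are made up for this example, and it assumes the delimiter always sits at the start of a line.

```python
import re


def split_on_delimiter(text, delimiter="Task:"):
    """Split plain text into chunks, each starting at a line that begins
    with the delimiter. Text before the first delimiter is kept as its
    own chunk if it is not empty."""
    pattern = re.compile("^" + re.escape(delimiter), re.MULTILINE)
    starts = [m.start() for m in pattern.finditer(text)]
    if not starts:
        # No delimiter found: return the whole text as one chunk.
        return [text.strip()] if text.strip() else []

    chunks = []
    preamble = text[:starts[0]].strip()
    if preamble:
        chunks.append(preamble)
    # Each chunk runs from one delimiter to the next (or to the end).
    for begin, end in zip(starts, starts[1:] + [len(text)]):
        chunks.append(text[begin:end].strip())
    return chunks


def subject_of(chunk, delimiter="Task:"):
    """Return the rest of the chunk's first line after the delimiter,
    usable as a title or file name for the extracted item."""
    first_line = chunk.splitlines()[0]
    if first_line.startswith(delimiter):
        return first_line[len(delimiter):].strip()
    return first_line.strip()
```

Each chunk could then be written to its own file, named after `subject_of(chunk)`.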

Hope that helps.

Steve
Stephen Zeoli 12/1/2009 7:54 pm
BTW, I haven't tried this with Zoot 6 yet, and if you use Zoot 5, of course, you will lose your formatting.

Steve
Alexander Deliyannis 12/1/2009 8:31 pm
Steve, thanks; I very much appreciate the quick reply.

Interestingly I considered Zoot which I own, but immediately rejected it because I need the formatting (which includes tables, images and the like).

However, your post reminded me that Zoot 6 supports RTF. I have not yet tried the Beta, but coincidentally Tom posted the link in the Zoot forum today, so I might give it a try. (In better times I would have spent the weekend trying out software to do such a repetitive task, but I am under way too much pressure from deadlines at this point.)

Thanks again,
Alexander

P.S. To whoever has additional ideas to propose: please keep the ideas coming!
Stephen Zeoli 12/1/2009 9:01 pm
Alexander,

I just tried to do this kind of import in Zoot 6, but apparently the function is not yet activated. There is a file menu item for importing, and it allows you to select a file to import, but nothing happens after that. I am using the latest version of the beta.

It doesn't look like MyInfo can do this either, I'm afraid.

Steve
Stephen Zeoli 12/1/2009 9:45 pm
I just took a look at doing this with AskSam 7. It works and retains much of the formatting, except for tables, unfortunately. So the hunt continues.

Steve
moritz 12/2/2009 5:12 am
4 solutions:
1. use Word in combination with SharePoint like *this* (http://msdn.microsoft.com/en-us/library/cc185135.aspx )
2. wait for Office 2010 (or use the Beta), again: document has to be stored in SharePoint, this will give you "co authoring" with document level change tracking and online/offline collaboration circumventing conventional checkin/checkout
3. consumer version of 2, with the advantage of being "free": use the forthcoming "free" web based version of Office 2010 apps (including Word) together with Office Live/Skydrive. I assume (it hasn't been released yet) that this will also get you "coauthoring".
4. use some other web based solution e.g. Google apps. Downside: you would give up Office compatibility (heavy conversion) and e.g. Google docs has a 500 KB (!) limit per Word document (last time I checked)
moritz 12/2/2009 5:15 am
5. (I missed what might be the most obvious solution): OneNote (2007 and 2010) can share Notebooks in realtime (... and offline) if you have a server side storage location. To make this easier for cross organizational teams again in 2010 there will be a server component (so you don't have to run any client side software)
Chris Thompson 12/2/2009 5:30 am
Why not just use Word's compound document features directly to accomplish this? When you switch into outline mode, there's a toolbar that lets you break out a file into multiple files, at least in Word 2003. This feature was meant for long documents, the idea being that you'd split a book up by chapter, with each chapter being an individual Word file, while still being able to edit them together as one file as well. It's a little clunky, but it seemed to work last time I used it.

-- Chris
clacha 12/2/2009 5:31 am
Hello,

I understand that you want to manage structured documents, for which XML is required but must be hidden from end users.

You may want to have a look at Scenari: http://scenari-platform.org
It is an open source suite for designing and using authoring and publishing chains for structured documents.
We have used it in production for a year, with positive feedback from the end users doing the authoring.

Regards
Alexander Deliyannis 12/2/2009 10:45 am
Steve: thanks; you've saved me a lot of effort by trying out Zoot before I did

Quant: thanks; the macro is very interesting though I cannot fathom how it calculates the size of the excerpts which appears completely unpredictable. I wish I knew vba...

Moritz: thanks for all the ideas;
- 3 looks great for the future (we are not on SharePoint and contributors are distributed around the globe)
- 4 has the unbelievable 500 KB limitation; we'll be moving to Google Apps soon, so I expect that will not be there
- 5 OneNote will also be ideal with the client side software; I'll check whether anybody else has OneNote (I don't use it, but I have it)

Chris; thanks, I'm trying to figure it out. Probably the best solution this time as everyone uses Word.

Clacha; thanks, I've tried Scenari in the past; not what I'm looking for right now since 80% of the work is ready in Word. However, I like the structured "separate content from layout" approach.

More as it happens, thanks again!
Alexander Deliyannis 12/2/2009 7:01 pm
For your information:

(a) I resorted to breaking down the file manually; it took less than two hours, and learning any new program would probably have taken me longer, for a task I am unlikely to do again for some time. That said, I am seriously considering learning VBA.

(b) The advantage of keeping things in .doc format is that everyone in the team can edit it. Since no two people will be working on the same task at the same time, I expect the updating will be manageable.

(c) I put all the files on a server location, so people can download them. They'll have to mail me back the updated versions for me to re-upload, which in my case is an advantage for maintaining the overview.

(d) MS Word's long document management features are clunky indeed, so in order to manage the whole lot of files created I resorted to a freeware program that had been suggested here in the past: ChapterByChapter http://pagesperso-orange.fr/sebastien.berthet/cbc/index.html I had never tried it previously, but I find it brilliant.

Thanks again for all your suggestions. The CRIMP side in me would have wanted a more hi-tech and automated solution and I expect that in the future I will try to foresee one from the start. As it happens, the current document grew out of something much less ambitious, so nobody had considered the possibilities earlier.
Pierre Paul Landry 12/2/2009 9:04 pm
Alexander Deliyannis wrote:
(c) I put all the files on a server location, so people can download them. They'll have to mail me back the updated versions for me to re-upload, which in my case is an advantage for maintaining the overview.

You could put the files in a Dropbox folder and share it with co-workers. This would make this solution much more manageable (no download, no re-load, no server passwords, etc)

(plus you get 250MB for each new user that registers...)
Alexander Deliyannis 12/2/2009 9:36 pm
Pierre Paul Landry wrote:
You could put the files in a Dropbox folder and share it with co-workers. This would make this solution much more manageable (no download, no re-load, no server passwords, etc)

Yes indeed, but here (finally) comes the CRIMP fun of it all: once the tasks are in discrete files there are some wonderful things I can do with them, apart from being able to manage them through Chapter By Chapter.

So I imported the whole folder into MyInfo, and rebuilt the structure within the program. Then I exported it all as a website and uploaded that to the server. Now my co-workers have an online structure where they can:
(a) see the organised tasks straight from their browser and
(b) download the .doc version of each task for their own editing

What I like about MyInfo is that I can simply exchange each task node's .html and .doc files when they are updated without having to go again through the whole process.
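For readers without MyInfo, the "browsable list with a download link per task" part can be roughly approximated with a few lines of Python. This is only a sketch of the idea, not what MyInfo's website export produces; the function name `build_index` and the page layout are assumptions made up for this example.

```python
import html
from urllib.parse import quote


def build_index(filenames, title="Project tasks"):
    """Return a simple HTML page with one download link per .doc file.

    Only names ending in .doc are listed, sorted alphabetically."""
    docs = sorted(name for name in filenames if name.lower().endswith(".doc"))
    items = []
    for name in docs:
        label = html.escape(name[:-4])   # file name without ".doc"
        href = quote(name)               # URL-encode spaces and the like
        items.append(f'<li><a href="{href}">{label}</a></li>')
    safe_title = html.escape(title)
    return (
        f"<html><head><title>{safe_title}</title></head><body>\n"
        f"<h1>{safe_title}</h1>\n<ul>\n"
        + "\n".join(items)
        + "\n</ul>\n</body></html>"
    )
```

The list of names could come from something like `[p.name for p in pathlib.Path("tasks").iterdir()]`, and the returned string would be saved as `index.html` next to the .doc files; when a task file is updated, replacing it on the server is enough, since the link stays the same.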

Thanks to Steve for suggesting MyInfo in the first place.

Ah, the simple joys of living...
moritz 12/3/2009 12:33 am
Alexander: Forgot to mention that OneNote 2010 will be free using the server side (browser based) app via SkyDrive. If you have a SkyDrive account you can play with the other apps already; OneNote currently says "coming soon". If you have the client app you could share a notebook with others who are using a browser (synchronized in real time, a bit like Google Wave).

- 5 OneNote will also be ideal with the client side software; I'll check whether anybody else has OneNote (I don't use it, but I have it)

Edwin Yip 12/3/2009 7:47 am
Hi Alexander,

Yes, the current version of Word does not work well with long documents. The upcoming Word 2010 will have a Navigation pane (actually an improved version of the current document map), and that will make working with long Word documents easier.

And your requirement confirms an idea of mine for Writing Outliner (a Word add-in; one of its features is to help you outline, navigate and edit book-length documents in Word more easily, like the navigation pane in Word 2010 but working with old versions of Word). The idea is to convert a long Word document, by headings, into a Writing Outliner project.

--
Edwin Yip
Writing Outliner - A new way to write in Microsoft Word?
http://WritingOutliner.com
Alexander Deliyannis 12/3/2009 6:33 pm
Moritz: you've got me hooked; 'free' is not important for me but it sure is for people like some of my co-workers that have not tried such software and are used to getting functionalities free on the internet. I'll try it out as soon as I get the chance.

Edwin: I've followed the Writing Outliner discussion and will definitely try it once it's out. What these days' exercise showed me is that there's a multitude of excellent tools discussed here that can help one handle information located in several discrete files, but very few can manipulate the information located in a single large file. The few exceptions mentioned included AskSam, which is unreliable, and Zoot 6, which is not yet released.

I think that what I would have needed is a contemporary version of Doug Engelbart's Hyperscope http://hyperscope.org/

Pierre-Paul, any chance that InfoQube could handle such smart break-down, de-compose / re-compose tasks?
Pierre Paul Landry 12/3/2009 7:26 pm
Alexander Deliyannis wrote:
Pierre-Paul, any chance that InfoQube could handle such smart break-down, de-compose / re-compose tasks?

Funny you asked... 2 days ago, I was adding a block tagging feature (select text then edit>>Enclose text with...). This was to add a feature requested by users, which is to be able to hide blocks of text when doing an HTML Export. Basically, blocks tagged with certain tags could be hidden from others.

As I was doing this, I thought that another use of this easy block tagging was to be able to de-compose / re-compose items.

So the answer is yes, it can certainly be added. I'm busy completing the Calendar right now, but will look at it in my spare time.