Apache OpenOffice (AOO) Bugzilla – Issue 4568
Word count (task)
Last modified: 2013-08-07 14:38:26 UTC
Anybody used to Word--or, really, most other word processors--isn't going to think to look under "Properties" to find this option. Any chance this could be moved to the Tools menu? Also, as a guy who writes for a living, I'd find this command much more useful if it worked on selected text as well as the entire document. (Or, better yet, if it could give me a word count on both the selected text and the entire document; that way I wouldn't have to worry about having one word selected by mistake, which happens all the time in Word.)
Reassigned to Christian.
My 2 cents worth: I think counting words in a selection has been asked for before. One might consider something similar to sums in Calc: If you select cells in a Calc table, it will automatically display the sum of the selected cells in the lower right corner. In the same way, one might display the word count of the current selection in Writer. The question is whether this could be implemented fast enough, and whether it's worth the bother. Regarding speed, this only makes sense if someone with a 500 page document presses Ctrl-A, and still gets the result almost instantaneously. To do this, one would probably have to implement some sort of word-count cache at the paragraph level. Which brings up the question how much effort the feature is worth...
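To make the paragraph-level cache idea concrete, here is a minimal sketch in Python. It is not OOo code: the class name, the `edit` and `total` methods, and the whitespace-only definition of a word are all assumptions made for illustration. The point is only that an edit invalidates one paragraph's count, so a full-document total does not have to rescan untouched paragraphs.

```python
class ParagraphWordCountCache:
    """Hypothetical per-paragraph cache of word counts (illustration only)."""

    def __init__(self, paragraphs):
        self.paragraphs = list(paragraphs)
        # One cached count per paragraph; None means "needs recounting".
        self._counts = [None] * len(self.paragraphs)

    def edit(self, index, new_text):
        # Editing a paragraph invalidates only that paragraph's cached count.
        self.paragraphs[index] = new_text
        self._counts[index] = None

    def total(self):
        # Recount only the paragraphs whose cache entry was invalidated.
        for i, text in enumerate(self.paragraphs):
            if self._counts[i] is None:
                # Assumption: a word is any run of non-whitespace characters.
                self._counts[i] = len(text.split())
        return sum(self._counts)


cache = ParagraphWordCountCache(["one two three", "four five"])
print(cache.total())
cache.edit(1, "four five six seven")
print(cache.total())
```

Whether this is worth the memory and bookkeeping in a real document model is exactly the open question raised above.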
I have received a private mail from the submitter. Three comments: 1) He's probably right that word-count-as-selection may not be that useful. 2) Maybe this would be the job for a macro. 3) The screenshot is attached to this bug. Here is the mail: --------------------------------------------------------- First of all, thanks for replying. I appreciate knowing that my comments haven't gone down a bit bucket somewhere :) Second of all, I think we may have different ideas of how this feature would/could/should work. I am *not* interested in an always-on, "live" word count; that just gets distracting. What I do want is, essentially, the equivalent of the word count in the Mac word processor Nisus Writer, but stripped of all the grammar metrics that I never pay attention to anyway (see screen shot attached). But if it's too much work to count both selected text and all text, I would certainly be content with an equivalent of what AbiWord offers, where it counts either the selected text or, if nothing's selected, the whole document. (My quibble with this setup, though, is that it's too easy to leave one character or word selected by mistake; a word count on both selected and all text eliminates that problem). Both of these programs, I should add, can generate a word count nearly instantaneously--at least on the 200-to-10,000-word documents I usually edit. If, OTOH, the standard is going to be "almost instantaneously" in 500-page documents, it seems to me that a lot of other features ought to be dropped from OpenOffice. My last argument here: However slow a count of selected text might run, it will still be faster than the current alternative--copying the text you have in mind, pasting it into another window and then running a word count on that. I hope I can be persuasive on this... pls. let me know if I'm not clear on any of these points. Thanks, Rob
Created attachment 1664 [details] Another program's word count dialog
looks very similar (duplicate?) to bug 1793
A very quick search came up with a macro which, if it works as advertised, will do a word count of selected text. Sorry, I don't have an OOo build to hand at the moment. Anyway, it could be a stop-gap for students and writers (I fall into the former and will definitely find it useful) until something more built in can be done. Anyway, it's at http://www.darwinwars.com/lunatic/bugs/oo_macros.html#swc David
Created attachment 1744 [details] OOo macro that counts words in a selection
dvo->illsleydc: Thanks for the macro. I've changed the macro David found. It now displays word + character count for selection and the document. Also, the macro seemed to sometimes skip the first or last word in a selection; I fixed this too. dvo->robpegoraro: Please evaluate whether this meets your needs. You can add the macro to your installation through the tools->macro dialog. Through tools->customize, you can assign it to a key-combination, so that e.g. Ctrl-C would pop up the count dialog. Please report whether this is usable or not; if it's usable, I'd like to get this included in the FAQ, since this has been asked rather often. dvo->davidfraser: Thanks for collecting the various word count bugs. I knew this had been asked for before, but not how often... :-) dvo->davidfraser: David, thanks
After trying this macro out for the past week, I've found that it overestimates things in longer selections. Here is one example, using the installation instructions at http://www.openoffice.org/dev_docs/instructions.html (starting with "The Windows version of OpenOffice.org 1.0" and ending with "Founder, OOoDocs"): The entire text measures 894 words in the Properties:Statistics dialog (890 in MS Word, FWIW). But selecting the full text and running the macro gets a different result: 937 words. It also overcounts the number of characters in the document--5,541, versus 5,417 as reported in Properties:Statistics. This margin of error seems to be smaller on shorter selections. If I select the first paragraph from that sample text, for instance, the macro reports 86 words, just one more than OpenOffice reports in Properties:Statistics when the paragraph is pasted into another window.
dvo->robpegoraro: Ah, I see. Two differences I'm aware of off-hand are: 1) The statistics character count does not include paragraph end marks, but the macro count does. 2) I believe text in text fields is counted differently. I'll have another look at it when I have some time this week.
I've updated the macro, and now it counts the same for the given example. The differences were: 1) the paragraph-end markers were previously counted as two characters in the macro, but not at all in the document statistics; 2) the word delimiters previously were space " ", braces "(" ")", and tabs in the macro, but space, tabs, and punctuation "." "," ";" "-" in the document statistics. In both cases, I adapted the macro to match the document statistics. Note 1: This is not a matter of right or wrong, but rather of what you want. Is "www.openoffice.org" one word, or three? You can adjust this in the document statistics, and one can also adjust this in the macro, if desired. Note 2: Fields are still being counted differently. Note 3: I bet the differences between Word and OOo can be accounted for by different what-is-a-word definitions, too.
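To illustrate how much the delimiter set alone changes the result, here is a small Python sketch. It is not the macro's code (the macro is written in OOo Basic and attached separately); the function name and the two constants are made up for this example, and the delimiter sets are only my reading of the ones listed in the comment above.

```python
def count_words(text, delimiters):
    """Count words, treating every character in `delimiters` as a separator."""
    # Map each delimiter to a space, then split on whitespace.
    # Note: split() also treats any other whitespace (e.g. newlines) as a separator.
    table = str.maketrans({ch: " " for ch in delimiters})
    return len(text.translate(table).split())


# Assumed delimiter sets, taken from the description in this thread.
OLD_MACRO_DELIMS = " ()\t"      # space, braces, tab
STATISTICS_DELIMS = " \t.,;-"   # space, tab, and punctuation . , ; -

print(count_words("www.openoffice.org", OLD_MACRO_DELIMS))
print(count_words("www.openoffice.org", STATISTICS_DELIMS))
```

With the first set the URL is a single word; with the second it splits at each dot. Neither answer is wrong, which is the point of Note 1.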
Created attachment 1856 [details] Updated word count macro
FWIW there appears to be some kind of an effort in standardising what a word count should mean. It includes someone from Sun but doesn't appear to be moving at all, and anyway I find it hard to believe that MS will ever change their algorithm, bearing in mind the hassle caused. So.. should we mimic them, or do something else... For the something else, is there any kind of written-up de-facto way of doing this? robpegoraro, what does your boss consider a word? Also, as an aside, do i18n people use different separators etc?
Oops, missed the URL of the standards people, sorry. http://lisa.org/oscar/seg/
Internationalization does indeed use different separators. For all I know, some languages (like Japanese or Chinese) don't have separators, but rather have certain characters always constitute a word, regardless of what precedes or follows them. We use the com::sun::star::i18n::BreakIterator service to handle this internally. (I.e., this determines how the cursor moves when you press Ctrl-Left/Right-Arrow) This mechanism is also accessible through the API, hence a more sophisticated version of the word count macro could use it and would then always count correctly for any language. I personally was looking for a quick solution though. I'm happy to help anyone else who wants to spend the time. My changes have adapted the macro's behaviour to the OOo statistics dialog's behaviour, which should address the points that Rob raised.
I can confirm that the new macro does seem to work as designed--I'm no longer seeing discrepancies in the results I get with the macro and the regular Properties:Statistics option. However, I did notice one other thing: The macro doesn't appear to register non-contiguous selections of text. After using Ctrl-click to highlight multiple blocks of text, the macro reports a word count of zero for the selected text.
Created attachment 1907 [details] WordCount macro, with support for multiple selection
Ahh, adding support for multiple selections is easy enough, so I included an updated version of the macro.
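For anyone curious why this was an easy change, the idea can be sketched as follows. This is Python rather than the macro's OOo Basic, and it is not the attached code: I am simply assuming the selection can be seen as a list of text ranges, one per Ctrl-clicked block, and `count_selection` is a made-up name.

```python
def count_selection(ranges):
    """Return (words, characters) summed over all selected text ranges."""
    # Assumption: each element of `ranges` is the plain text of one selected block.
    words = sum(len(text.split()) for text in ranges)
    chars = sum(len(text) for text in ranges)
    return words, chars


# Two non-contiguous blocks selected with Ctrl-click.
print(count_selection(["one two", "three"]))
```

A version that only looks at a single range would report nothing useful for a multiple selection, which matches the zero count Rob saw.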
*** Issue 9533 has been marked as a duplicate of this issue. ***
*** Issue 5073 has been marked as a duplicate of this issue. ***
Reassigned to Bettina.
*** Issue 11265 has been marked as a duplicate of this issue. ***
*** Issue 12022 has been marked as a duplicate of this issue. ***
*** Issue 13361 has been marked as a duplicate of this issue. ***
*** Issue 13479 has been marked as a duplicate of this issue. ***
*** Issue 14644 has been marked as a duplicate of this issue. ***
*** Issue 12334 has been marked as a duplicate of this issue. ***
*** Issue 9572 has been marked as a duplicate of this issue. ***
I have unmarked 14644 as a duplicate of this issue, since it is about the way that the word count in document properties is inaccurate (and I believe that's a regression in 1.1 beta) and not about accessing or extending the document properties word count, which is what this discussion covered.
*** Issue 14645 has been marked as a duplicate of this issue. ***
*** Issue 15429 has been marked as a duplicate of this issue. ***
*** Issue 15526 has been marked as a duplicate of this issue. ***
*** Issue 15398 has been marked as a duplicate of this issue. ***
Need to add bug 1793 and bug 3155 as duplicates of this.
*** Issue 14063 has been marked as a duplicate of this issue. ***
FYI, the macro from last June works in 1.1RC2. But... my point in filing this bug report was to see this feature eventually added to the writer application, not left as an optional add-on for users who can navigate the macro dialog box. Am I correct in assuming that this feature didn't make the cut for 1.1? If so, any particular reason why?
Michael Meeks is hosting a patch against 1.1 that adds Word Count to the tools menu. He hasn't to my knowledge submitted it to IZ yet for various reasons, but it is there and it works. http://ooo.ximian.com/patches/RC3/word-count.diff Dan
*** Issue 20246 has been marked as a duplicate of this issue. ***
Platform/OS to ALL, target-milestone: not determined...
*** Issue 5995 has been marked as a duplicate of this issue. ***
*** Issue 19128 has been marked as a duplicate of this issue. ***
*** Issue 22313 has been marked as a duplicate of this issue. ***
The ximian patch has moved; it is now at: http://ooo.ximian.com/patches/OOO_1_1/word-count.diff Thanks a lot for the macro (I'm a novice user so it took a long time to get it to work, but I did, and I set it to work with <ctrl>+w, man, cool). I know the darwinwars guys did most of the work, but it is still cool. And if there is any way this macro could get integrated into the code (but not as a macro, of course), then it would probably satisfy everyone that keeps asking for a better word count. I give it an A+, exactly what I wanted. John
*** Issue 23912 has been marked as a duplicate of this issue. ***
In response to: > --- Additional comments from David Illsley Tue Jun 4 05:22:20 -0800 2002 > > FWIW there appears to be some kind of an effort in standardising what > a word count should mean. It includes someone from Sun but doesn't > appear to be moving at all and anyway I find it hard to believe that > MS will ever change their algorithm bearing in mind the hassle caused. > > So.. should we mimic them, or do something else... David --> I can see *all* sorts of problems trying to explain to the PHBs of the world why OOo and Word have different counts, and none of the images in my head are pleasant. Like it or not, MS Word has been the de facto standard. I don't think word count behavior should be changed away from Word's without some very compelling (and PHB-understandable) reason. I make my living on word counts as a translator, and I cannot afford to have clients accusing me of padding my bills when the numbers don't match. Nor would I be happy to realize I'd shorted myself if OOo's count was low compared to Word's. Just my two yen.
In response to: > ------- Additional comments from dvo Tue Jun 4 23:58:17 -0800 2002 ------- > > Internationalization does indeed use different separators. For all I > know, some languages (like Japanese or Chinese) don't have separators, > but rather have certain characters always constitute a word, > regardless of what precedes or follows them. We use the > com::sun::star::i18n::BreakIterator service to handle this internally. > (I.e., this determines how the cursor moves when you press > Ctrl-Left/Right-Arrow) dvo --> I translate from Japanese to English, and have studied some Chinese and Korean. The whole concept of "word" simply doesn't exist as folks used to Indo-European languages would recognize it. Certain characters don't always constitute a word, though certain combinations sometimes do. Those dealing with writing as authors, teachers, editors, translators, etc. use character counts, not including spaces. I note that OOo only offers up character counts with spaces included (please see issue # 10356 where I've attached a file [soon to be two] showing some of the differences). This causes problems for CJK languages given the way these have traditionally been counted. Finding some means of parsing CJK languages to use CTRL-Left and CTRL-Right becomes problematic; other than having a complex dictionary and grammatical map combination of some sort, I'm not sure how else it would work (but then I'm not much of a coder ;). Thing is, there's some disagreement in linguistic circles as to what constitutes a "word" in Japanese due to the way it agglutinates -- for example, do particles (similar to prepositions and articles in English) count as separate words, or as suffixes to the nouns they follow? I'll attach a sample of Japanese here to illustrate the problem.
Created attachment 12806 [details] Example of faulty OOo navigation by "word" in Japanese (CTRL-Left/Right)
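As a rough illustration of the character count requested in the comment above (characters not including spaces), here is a short Python sketch. It says nothing about how OOo counts internally; `count_chars` and its `include_spaces` parameter are invented for this example, and "character" here means a Unicode code point.

```python
def count_chars(text, include_spaces=True):
    """Count characters, optionally leaving out all whitespace."""
    if include_spaces:
        return len(text)
    # str.isspace() is also true for the ideographic space U+3000
    # that often appears in Japanese text.
    return sum(1 for ch in text if not ch.isspace())


sample = "日本語\u3000テスト"  # six characters separated by one ideographic space
print(count_chars(sample))
print(count_chars(sample, include_spaces=False))
```

This sidesteps the question of what a "word" is in Japanese entirely, which is why character counts are the traditional measure there.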
With all these duplicate issues - don't you think that someone should consider putting them all together. I mean the # of dups is greater than the number of votes! Let's face it, burying the word count in the properties and then the lack of selection count is not good. The current Word Count scenario stinks. I'd really like to see Word Count have a better place and usefulness in OOo 2.0.
Since my comments have been lost among the mass of "duplicates", I'll add it here. What is needed is a full-featured word count. A user needs to select words (not spaces), footnotes or not, bibliography entries or not, and a selection or entire document. That would solve most complaints about word count I have seen. This is an OLD issue. So far what we have is a user-supplied and modest word count macro. What we need is an integrated word count option. I am amazed that no one is working on this at this late stage. After two years and an enormous number of requests for this feature, it is still marked as "NEW". We need a new category "ANCIENT" to refer to filed issues with lots of requests but never assigned to a developer.
removing 23974 from the list of "depends on" issues since it's a duplicate of issue 14050
When a student writes an essay, or a content-provider writes an article, they first and foremost want to know how their assigned word limit is being used up. This tells them where they need to trim their text. It's vital information. In MS Word, in consequence, the selective word count is used *dozens* of times per day to check the proportional sizes of text sections, so that they can make the best use of their space. The absence of an intuitive, built-in word count is the single greatest barrier that I have personally found to OpenOffice adoption amongst people that I have recommended it to as an MS Word replacement. OpenOffice Writer simply *does not count words*, at least not in the way that MS Word users need and expect it to. For a basic function, performed dozens of times a day, they shouldn't have to install version-sensitive macros, configure keybindings, or open new documents for the sole purpose of pasting text to count words. As I said, students and other writers that I have recommended OO to have pointed to the absence of selective word counting as THE deciding factor in sticking with MS Word. It has been years now that this bug has persisted on this forum, with numerous worthwhile implementation suggestions, without being resolved. There have been thirty-odd duplicate bugs reporting this omission. If I felt my C was up to scratch I would gladly code this myself. As it is, can I at least note that other open-source projects have implemented perfectly functional algorithms to this end, and their code is publicly available? Personally, I would love to see two extra columns in the navigator showing word count by section, both absolute and in percentage terms (and Flesch readability stats would be fabulous too, though I'd leave them in Statistics). But all these features that could easily surpass Word's counting system are nowhere near as important as simply getting *some* kind of solution in place to address this fundamental usability issue.
Please fix this. I'm in graduate school doing a humanities subject and it's incredibly annoying not to be able to check selections of text. It can't take that long to fix the thing, and it would make a huge difference to a LOT of your users.
I quite agree. I made the jump to OO quite recently and am mostly getting along fine with it, but was completely astonished to discover how primitive and inflexible the word count feature is. As a law student working to very tight word limits, knowing how many words are in my footnotes and titles and so on is absolutely crucial. It's all very well to install a macro, but if I have work due in on a Monday the last thing I want to be doing is spending my time fiddling around with macros!! Selective word count is surely an obvious thing to put into a later build. FWIW it should be more obvious too; I can't see that popping the word count or stats feature into the tools menu would cause that much confusion for long-time users, it's as much as learning a new ALT-x-x key combo. But a properly working word count is the priority!
related: Issue 24038 Flesch-Kincaid Grade Level readability statistics (enhancement)
Counting words in a selection will be implemented in OO.o 2.0. Please have a look at http://specs.openoffice.org/writer/wordcount/Enhanced_Wordcount.sxw.
Reopening issue. There's more to be improved than only word count in selection. Counting in selection is 17964. But there are at least 10356 (esp. important for Asian languages), 14050 (journalists, science) and 19692 still to be done. I change the issue type to task to reflect that this is not an issue with implementation details, but an issue for collecting and referencing issues related to word count.
Ok, I reset the target to office later, as there is no resource for considering more cases concerning word count in OO.o 2.0.
I CANNOT believe you are going to miss this vital feature out of OO.o 2.0. All I can say is you guys really know how to shoot yourselves in the foot. Instead of messing about with new enhancements could you at least get the basics right? This is one of quite a few basic features that are missing from OO.o that stop me from using this professionally or recommending it to others in my field of work. I really hope you reconsider because you will miss out on a lot of users at school and university level who need word counts for essays, and on professional writers. Please reconsider.
This is a disaster. Without a decent word count feature OpenOffice is useless for anyone working as a journalist, or as a student in any humanities subject. This is BY FAR the greatest flaw in the software, and should be your first priority.
As far as the French and Spanish languages are concerned, what a "word" is is not a matter of opinion or a Microsoft standard, but is decided by the Académie française and the Real academia espanola, respectively, in their famous dictionaries. Some former phrases are regarded as one French word, e.g. "ad hoc". The 8th edition of the Académie française dictionary is freely available online: http://atilf.atilf.fr/academie.htm* the 9th edition (from A to négaton) too: http://www.academie-francaise.fr/dictionnaire/ Web site of the Real academia espanola and their dictionary: http://www.rae.es/
Regarding Asian (double-byte) "word" and character count, I just attached a sample text to Issue 17964 comparing Word's count with OOo's count. I used 2.0 Beta 1, and was most disappointed to find that the only changes from 1.1 were the location (now in Tools -> Word Count) and the minor (but quite welcome) addition that we can now count selections. However, *none* of the remaining word count issues have been addressed in any way visible to the user. Asian text is still counted incorrectly (borking any count of mixed Asian - Latin text as well), and footnotes and endnotes are still not includable (or is that excludable?). I have not looked into more complex issues such as text boxes and the like. Please have a look. This issue is indeed ANCIENT and in dire need of some action.
Seems that issue http://www.openoffice.org/issues/show_bug.cgi?id=41454 would be good to include in this task.
Counting the words and characters in a document and in a selection via Tools / Word Count has been implemented.
This issue is closed.
Someone please explain why this is STILL not a standard feature? The problem has come up over and over again since 2002. The only functional solution isn't even documented here: a script from a guy named Yawar Amin out of Canada. Several of Writer's tools are nice, but they do NOT solve the need for a running/live word count, because the count still disappears as soon as you continue with the document. (Amin's is floating, on top, and I find plenty of places to put it.) BUT... there is space all over the GUI to put something like word/char count, and even the percentage of target. I know, it's doing a lot, open source, etc., but come on... 8 years, and still nothing? PLEASE, reopen this, give it a target milestone that's meaningful, and let OOo catch up with Microsoft's Word 2007 on this issue? THANKS!