str_word_count

Return information about words used in a string (PHP 4 >= 4.3.0, PHP 5)
mixed str_word_count ( string string [, int format [, string charlist]] )

Example 2450. A str_word_count() example

<?php

$str = "Hello fri3nd, you're
      looking          good today!";

print_r(str_word_count($str, 1));
print_r(str_word_count($str, 2));
print_r(str_word_count($str, 1, 'àáãç3'));

echo str_word_count($str);

?>

The above example will output:

Array
(
   [0] => Hello
   [1] => fri
   [2] => nd
   [3] => you're
   [4] => looking
   [5] => good
   [6] => today
)

Array
(
   [0] => Hello
   [6] => fri
   [10] => nd
   [14] => you're
   [29] => looking
   [46] => good
   [51] => today
)

Array
(
   [0] => Hello
   [1] => fri3nd
   [2] => you're
   [3] => looking
   [4] => good
   [5] => today
)

7


Code Examples / Notes » str_word_count

megat

[Ed: You'd probably want to use regular expressions if this was the case --alindeman @ php.net]
Consider what will happen in some of the above suggestions when a person puts more than one space between words. That's why it's not sufficient just to explode the string.


webmaster

I was trying to make an efficient word splitter and "paragraph limiter", e.g. to limit item text to 100 or 200 words and so forth.
I don't know how well this compares, but it works nicely.
function trim_text($string, $word_count=100)
{
    $trimmed = "";
    // collapse runs of spaces so explode() does not yield empty elements
    $string = preg_replace("/\040+/"," ", trim($string));
    $stringc = explode(" ",$string);
    if($word_count >= sizeof($stringc))
    {
        // nothing to do, our string is smaller than the limit.
        return $string;
    }
    // trim the string to the word count
    for($i=0;$i<$word_count;$i++)
    {
        $trimmed .= $stringc[$i]." ";
    }
    // add only two dots when the cut text already ends with a period
    if(substr($trimmed, strlen(trim($trimmed))-1, 1) == '.')
        return trim($trimmed).'..';
    else
        return trim($trimmed).'...';
}
$text = "some  test          text goes in here, I'm not sure, but ok.";
echo trim_text($text,5);


geertdd

This is an update to my previously posted word_limiter() function. The regex is even more optimized now. Just replace the preg_match line. Change to:
<?php
preg_match('/^\s*(?:\S+\s*){1,'. (int) $limit .'}/', $str, $matches);


aidan

This functionality is now implemented in the PEAR package PHP_Compat.
More information about using this function without upgrading your version of PHP can be found on the below link:
http://pear.php.net/package/PHP_Compat


16-jan-2005 02:38

This function seems to view numbers as whitespace. I.e. a word consisting of numbers only won't be counted.
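
A minimal sketch of that behaviour, and of one way around it: the optional charlist argument (shown in the signature at the top of this page) lists extra characters to treat as part of a word. The sample text and variable names are only illustrative.

```php
<?php
// Digits are not word characters by default, so "123" is skipped entirely.
$text = "abc 123 def";
$default = str_word_count($text);                        // counts "abc" and "def"

// Listing the digits in the charlist makes them count as word characters.
$with_digits = str_word_count($text, 0, '0123456789');   // counts "abc", "123", "def"

echo $default, " ", $with_digits, "\n";                  // prints "2 3"
```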

muz1

This function is awesome; however, I needed to display the first 100 words of a string. I am submitting this as a possible solution, but also to get feedback as to whether it is the most efficient way of doing it.
<?php
$currString = explode(" ", $string);
// stop at the end of the text if it has fewer than 100 words
$wordLimit = min(100, count($currString));
for ($wordCounter=0; $wordCounter<$wordLimit; $wordCounter++) { echo $currString[$wordCounter]." "; }
?>


brettnospam

This example may not be pretty, but it proves accurate:
<?php
//count words
$words_to_count = strip_tags($body);
$pattern = "/[^(\w|\d|\'|\"|\.|\!|\?|;|,|\\|\/|\-\-|:|\&|@)]+/";
$words_to_count = preg_replace ($pattern, " ", $words_to_count);
$words_to_count = trim($words_to_count);
$total_words = count(explode(" ",$words_to_count));
?>
Hope I didn't miss any punctuation. ;-)


rabin

There is a small bug in the "trim_text" function by "webmaster at joshstmarie dot com" below. If the string's word count is less than or equal to $truncation, that function will cut off the last word in the string.
[EDITOR'S NOTE: above referenced note has been removed]
This fixes the problem:
<?php
function trim_text_fixed($string, $truncation = 250) {
   $matches = preg_split("/\s+/", $string, $truncation + 1);
   $sz = count($matches);
   if ( $sz > $truncation ) {
       unset($matches[$sz-1]);
       return implode(' ',$matches);
   }
   return $string;
}
?>


philip

Some ask why not just split on ' '. Well, it's because simply exploding on a ' ' isn't fully accurate.  Words can be separated by tabs, newlines, double spaces, etc.  This is why people tend to separate on all whitespace with regular expressions.
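
To make the difference concrete, here is a small sketch comparing the two approaches on a string that mixes a double space, a tab and a newline. The sample text and variable names are made up for illustration.

```php
<?php
$text = "one  two\tthree\nfour";   // double space, tab and newline as separators

// explode() only knows about the single space: it yields an empty element
// for the double space and leaves the tab and newline inside one element.
$by_space = explode(" ", $text);

// preg_split() on \s+ treats any run of whitespace as one separator.
$by_whitespace = preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);

echo count($by_space), " ", count($by_whitespace), "\n";   // prints "3 4"
```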

aix

One function.
<?php
if (!function_exists('word_count')) {
function word_count($str,$n = "0"){
$m=strlen($str)/2;
$a=1;
while ($a<$m) {
$str=str_replace("  "," ",$str);
$a++;
}
$b = explode(" ", $str);
$i = 0;
foreach ($b as $v) {
   $i++;
}
if ($n==1) return $b;
else  return $i;
}
}
$str="Tere Tartu linn";
$c  = word_count($str,1); // returns an array of the words
$d  = word_count($str); // returns an int: how many words were in the text
print_r($c);
echo $d;
?>


kirils solovjovs

None of this worked for me. I think countwords() is very encoding dependent. This is the code for win1257. For other layouts you just need to redefine the ranges of letters...
<?php
function countwords($text){
       $ls=0;//was it a whitespace?
       $cc33=0;//counter
       for($i=0;$i<strlen($text);$i++){
               $spstat=false; //is it a number or a letter?
               $ot=ord($text[$i]);
               if( (($ot>=48) && ($ot<=57)) ||  (($ot>=97) && ($ot<=122)) || (($ot>=65) && ($ot<=90)) || ($ot==170) ||
               (($ot>=192) && ($ot<=214)) || (($ot>=216) && ($ot<=246)) || (($ot>=248) && ($ot<=254))  )$spstat=true;
               if(($ls==0)&&($spstat)){
                       $ls=1;
                       $cc33++;
               }
               if(!$spstat)$ls=0;
       }
       return $cc33;
}
?>


artimis

Never use this function to count or separate alphanumeric words: it splits them at the digits, keeping only the runs of letters.  You could use another function, preg_split(), when splitting alphanumeric words.  It works with Chinese characters as well.
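
A short sketch of the difference, assuming words are separated by whitespace. The sample text is made up.

```php
<?php
$text = "fri3nd r2d2 today";

// str_word_count() breaks each word at its digits and drops the digits.
$builtin = str_word_count($text, 1);

// Splitting on whitespace keeps alphanumeric words in one piece.
$split = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

print_r($builtin);   // fri, nd, r, d, today
print_r($split);     // fri3nd, r2d2, today
```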

jtey

In the previous note, the example will only extract from the string, words separated by exactly one space.  To properly extract words from all strings, use regular expressions.
Example (extracting the first 4 words):
<?php
$string = "One    two three       four  five six";
echo implode(" ", array_slice(preg_split("/\s+/", $string), 0, 4));
?>
The above $string would not have otherwise worked when using the explode() method below.


lwright

If you are looking to count the frequency of words, try:
<?php
$wordfrequency = array_count_values( str_word_count( $string, 1) );
?>
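
A slightly fuller sketch of the same idea. Lower-casing the text first is an added assumption, so that "The" and "the" are counted together; the sample text is made up.

```php
<?php
$text = "The cat saw the other cat";

// Normalise case, split into words, then tally each distinct word.
$wordfrequency = array_count_values(str_word_count(strtolower($text), 1));

// Most frequent words first.
arsort($wordfrequency);
print_r($wordfrequency);
```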


andrea

If the string doesn't contain a space " ", the explode method doesn't do anything, so I wrote this and it seems to work better. I don't know how it compares in time and resources.
<?php
function str_incounter($match,$string) {
$count_match = 0;
for($i=0;$i<strlen($string);$i++) {
if(strtolower(substr($string,$i,strlen($match)))==strtolower($match)) {
$count_match++;
}
}
return $count_match;
}
?>
example
<?php
$string = "something:something!!something";
$count_some = str_incounter("something",$string);
// will return 3
?>


olivier

I will not discuss the accuracy of this function but one of the source codes above does this.
<?php
function wrdcnt($haystack) {
$cnt = explode(" ", $haystack);
return count($cnt) - 1;
}
?>
That could be replaced by
<?php
function wrdcnt($haystack) {
return substr_count($haystack,' ') + 1;
}
?>
I doubt this needs to be a function :)


josh

I was interested in a function which returned the first few words out of a larger string.
In reality, I wanted a preview of the first hundred words of a blog entry which was well over that.
I found all of the other functions which explode and implode strings to arrays lost key markups such as line breaks etc.
So, this is what I came up with:
function WordTruncate($input, $numWords) {
if(str_word_count($input,0)>$numWords)
{
$WordKey = str_word_count($input,1);
$WordIndex = array_flip(str_word_count($input,2));
return substr($input,0,$WordIndex[$WordKey[$numWords]]);
}
else {return $input;}
}
While I haven't counted per se, it's accurate enough for my needs. It will also return the entire string if it's less than the specified number of words.
The idea behind it? Use str_word_count to identify the nth word, then use str_word_count to identify the position of that word within the string, then use substr to extract up to that position.
Josh.


gorgonzola

i tried to write a wordcounter and ended up with this:
<?php
//strip html-codes or entities
$text = strip_tags(strtr($text, array_flip(get_html_translation_table(HTML_ENTITIES))));
//count the words
$wordcount = preg_match_all("#(\w+)#", $text, $match_dummy );
?>


joshua dot blake

I needed a function which would extract the first hundred words out of a given input while retaining all markup such as line breaks, double spaces and the like. Most of the regexp based functions posted above were accurate in that they counted out a hundred words, but recombined the paragraph by imploding an array down to a string. This did away with any such hopes of line breaks, and thus I devised a crude but very accurate function which does all that I ask it to:
function Truncate($input, $numWords)
{
 if(str_word_count($input,0)>$numWords)
 {
$WordKey = str_word_count($input,1);
$PosKey = str_word_count($input,2);
reset($PosKey);
foreach($WordKey as $key => &$value)
{
$value=key($PosKey);
next($PosKey);
}
return substr($input,0,$WordKey[$numWords]);
 }
 else {return $input;}
}
The idea behind it? Go through the keys of the arrays returned by str_word_count and associate the number of each word with its character position in the phrase. Then use substr to return everything up until the nth character. I have tested this function on rather large entries and it seems to be efficient enough that it does not bog down at all.
Cheers!
Josh


aurelien marchand

I found a more reliable way to print, say, the first 100 words followed by an ellipsis. My code goes this way:
$threshold_length = 80; // 80 words max
$phrase = "...."; // populate this with the text you want to display
$abody = str_word_count($phrase,2);
if(count($abody) > $threshold_length){ // gotta cut; with exactly $threshold_length words there is no next word to cut at
 $tbody = array_keys($abody);
 echo "

" . substr($phrase,0,$tbody[$threshold_length]) . "... <span class=\"more\"><a href=\"?\">read more</a></span> \n";
} else { // put the whole thing
 echo "

" . $phrase . "\n";
}
For any questions, com.iname@artaxerxes2


geertdd

Here's a very fast word limiter function that preserves the original whitespace.
<?php
function word_limiter($str, $limit = 100, $end_char = '&#8230;') {
   
   if (trim($str) == '')
       return $str;
   
   preg_match('/\s*(?:\S*\s*){'. (int) $limit .'}/', $str, $matches);
   if (strlen($matches[0]) == strlen($str))
       $end_char = '';
   return rtrim($matches[0]) . $end_char;
}
?>
For the thought process behind this function, please read: http://codeigniter.com/forums/viewthread/51788/
Geert De Deckere
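
As a rough usage sketch, the function above can be combined with the pattern from the author's follow-up note on this page. The name word_limiter_sketch and the plain '...' end character are assumptions made for this example, and $limit is assumed to be at least 1.

```php
<?php
// Hypothetical name; the body follows the note above with the updated pattern.
function word_limiter_sketch($str, $limit = 100, $end_char = '...') {
    if (trim($str) == '') {
        return $str;
    }
    // Match up to $limit words together with the whitespace around them.
    preg_match('/^\s*(?:\S+\s*){1,' . (int) $limit . '}/', $str, $matches);
    // If the match covers the whole string, nothing was cut off.
    if (strlen($matches[0]) == strlen($str)) {
        $end_char = '';
    }
    return rtrim($matches[0]) . $end_char;
}

$text = "one  two\nthree four five";
echo word_limiter_sketch($text, 3), "\n";   // the double space and newline survive
```

Because the cut is made on the original string rather than on an exploded array, the whitespace between the kept words is left untouched.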


madcoder

Here's a function that will trim a $string down to a certain number of words, and add a ... on the end of it.
(expansion of muz1's 1st 100 words code)
----------------------------------------------
function trim_text($text, $count){
$text = str_replace("  ", " ", $text);
$string = explode(" ", $text);
$trimed = "";
// stop at $count words or at the end of the text, whichever comes first
$last = min($count, count($string));
for ( $wordCounter = 0; $wordCounter < $last; $wordCounter++ ){
$trimed .= $string[$wordCounter];
if ( $wordCounter < $last - 1 ){ $trimed .= " "; }
else { $trimed .= "..."; }
}
$trimed = trim($trimed);
return $trimed;
}
Usage
------------------------------------------------
$string = "one two three four";
echo trim_text($string, 3);
returns:
one two three...


rcatinterfacesdotfr

Here is another way to count words :
$word_count = count(preg_split('/\W+/', $text, -1, PREG_SPLIT_NO_EMPTY));
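
Note that this does not always agree with str_word_count(): \W treats the apostrophe as a separator, while str_word_count() keeps it inside the word. A small sketch with a made-up sample string:

```php
<?php
$text = "you're looking good";

// \W+ splits at the apostrophe: you, re, looking, good
$by_regex = count(preg_split('/\W+/', $text, -1, PREG_SPLIT_NO_EMPTY));

// str_word_count() keeps "you're" as one word
$by_builtin = str_word_count($text);

echo $by_regex, " ", $by_builtin, "\n";   // prints "4 3"
```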


30-jan-2007 04:15

Here is a PHP word counting function together with a JavaScript version which will print the same result.
<?php
     //Php word counting function
     function word_count($theString)
     {
       $fullStr = $theString." ";
       // ereg_replace() was removed in PHP 7, so preg_replace() is used here;
       // the patterns mirror the ones in the JavaScript version
       $initial_whitespace_rExp = "/^[^A-Za-z0-9]+/";
       $left_trimmedStr = preg_replace($initial_whitespace_rExp,"",$fullStr);
       $non_alphanumerics_rExp = "/[^A-Za-z0-9]+/";
       $cleanedStr = preg_replace($non_alphanumerics_rExp," ",$left_trimmedStr);
       $splitString = explode(" ",$cleanedStr);
       
       // the trailing space added above leaves one empty element at the end
       $word_count = count($splitString)-1;
       
       if(strlen($fullStr)<2)
       {
         $word_count=0;
       }      
       return $word_count;
     }
?>
<script type="text/javascript">
     //Function to count words in a phrase
     function wordCount(theString)
     {
       var fullStr = theString + " ";
       var initial_whitespace_rExp = /^[^A-Za-z0-9]+/gi;
       var left_trimmedStr = fullStr.replace(initial_whitespace_rExp, "");
       var non_alphanumerics_rExp = /[^A-Za-z0-9]+/gi;
       var cleanedStr = left_trimmedStr.replace(non_alphanumerics_rExp, " ");
       var splitString = cleanedStr.split(" ");
       
       // the trailing space added above leaves one empty element at the end
       var word_count = splitString.length -1;
       
       if (fullStr.length <2)
       {
         word_count = 0;
       }      
       return word_count;
     }
</script>


tim

As used above:
"/[^(\w|\d|\'|\"|\.|\!|\?|;|,|\\|\/|\-\-|:|\&|@)]+/";
Using this pattern for counting words, does anyone else have a problem when someone puts quotes anywhere in the body?  For me, it cuts off the rest of the data in the field, and just puts the pre-quote info into the db.


cathy

A cute little function for truncating text to a given word limit:
<?php
function limit_text($text, $limit) {
 // compare the word count, not the character length, against the limit
 if (str_word_count($text, 0) > $limit) {
  $words = str_word_count($text, 2);
  $pos = array_keys($words);
  $text = substr($text, 0, $pos[$limit]) . '...';
 }
 return $text;
}
?>
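
Several notes on this page use the same idea: take the word offsets from format 2 and cut the original string just before the first unwanted word. A minimal sketch of that approach follows. The name limit_words_sketch is hypothetical, and returning the text unchanged for a limit below 1 is an assumption made here, not something the notes specify.

```php
<?php
// Hypothetical helper: keep the first $limit words of $text, preserving the
// original whitespace and punctuation between them.
function limit_words_sketch($text, $limit) {
    // Format 2 keys each word by its offset in $text.
    $offsets = array_keys(str_word_count($text, 2));
    if ($limit < 1 || count($offsets) <= $limit) {
        return $text;   // nothing to cut
    }
    // $offsets[$limit] is where word number $limit + 1 starts; cut just before it.
    return rtrim(substr($text, 0, $offsets[$limit])) . '...';
}

echo limit_words_sketch("Hello there,\n  my good friend", 3), "\n";
```

Using array_keys() on the format 2 result avoids the problem with array_flip(), which keeps only one position for a word that occurs more than once.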

