Following on from the previous post, this example shows how to use an STL multimap to track the line number(s) associated with each word in a text file.
The program reads the text line by line, replacing punctuation and all other non-alphabetic characters with spaces so that only words remain. Each (word, line number) pair is then inserted into the multimap container using the insert function.
As with the previous posting, which deals with counting the frequency of words in a file, this example also uses the sample Hamlet.txt file.
The code listing is as follows:
#include <cctype>
#include <fstream>
#include <functional>
#include <iostream>
#include <map>
#include <sstream>
#include <string>
using namespace std;

int main()
{
    const string path = "/home/andy/NetBeansProjects/Hamlet.txt"; // Linux
    //const string path = "C:\\Dump\\Hamlet.txt";                 // Windows
    ifstream input( path.c_str() );
    if ( !input )
    {
        cerr << "Error opening file." << endl;
        return 1;
    }

    multimap< string, int, less<string> > words;
    string text;
    string word;

    // For each line of text (std::getline places no limit on line length)
    for ( int line = 1; getline( input, text ); line++ )
    {
        // Replace every non-alphabetic character with a space, leaving only words
        for ( string::size_type k = 0; k < text.size(); k++ )
        {
            // Cast to unsigned char: isalpha is undefined for negative values
            if ( !isalpha( static_cast<unsigned char>( text[ k ] ) ) )
                text[ k ] = ' ';
        }

        // Store each word with the current line number. Testing the
        // extraction itself stops the last word being inserted a second
        // time once the stream is exhausted.
        istringstream i( text );
        while ( i >> word )
        {
            words.insert( pair<const string,int>( word, line ) );
        }
    }
    input.close();

    // Output results: one row per distinct word, followed by its line numbers
    multimap< string, int, less<string> >::iterator it1;
    multimap< string, int, less<string> >::iterator it2;
    for ( it1 = words.begin(); it1 != words.end(); )
    {
        // it2 marks the end of the run of entries that share this word
        it2 = words.upper_bound( (*it1).first );
        cout << (*it1).first << " : ";
        for ( ; it1 != it2; it1++ )
        {
            cout << (*it1).second << " ";
        }
        cout << endl;
    }
    return 0;
}
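Running the program prints each distinct word followed by the line numbers on which it occurs. Notice that a word occurring more than once on the same line is recorded once per occurrence, so that line number is repeated in the output.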
Related post: Counting the Number of Words in a Text File in STL / C++
