Derived from: none
Declared in: HTMLFile.h
HTMLFile is the class used to represent an HTML text file and to parse it. You can also add beacons to retrieve specific information from the page, and extract all of the page's labels and links.
|
Adds a beacon at the end of the HTMLFile's beacon list. The order is very important, because the beacons are searched for in that order. If a beacon isn't found, those after it won't be found either.
The HTMLFile takes ownership of the beacon. You must not delete it; the HTMLFile does so when it is destroyed. For this reason, the beacon must not be allocated on the stack.
Always call AddBeacon() before Search().
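The effect of beacon order can be illustrated with a small standalone sketch. It is not HTMLFile's implementation: FindInOrder is a hypothetical helper, and plain substring matching stands in for whatever matching the real beacons perform. It only shows why a missing beacon hides every beacon added after it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of HTMLFile: returns the offset of each
// marker in `text`, searching for the markers strictly in the given order.
// A marker that is not found gets std::string::npos, and so does every
// marker after it, even if it occurs somewhere in the text.
std::vector<std::size_t> FindInOrder(const std::string& text,
                                     const std::vector<std::string>& markers)
{
    std::vector<std::size_t> positions(markers.size(), std::string::npos);
    std::size_t cursor = 0;  // each search starts where the previous match ended
    for (std::size_t i = 0; i < markers.size(); ++i) {
        const std::size_t found = text.find(markers[i], cursor);
        if (found == std::string::npos)
            break;  // the remaining markers stay marked as "not found"
        positions[i] = found;
        cursor = found + markers[i].size();
    }
    return positions;
}
```

With this model, swapping two markers changes the result: the second marker is only looked for after the first one, so a marker that appears earlier in the page than its predecessor is reported as missing.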
|
Prints to the standard output all the beacons, labels and links found after a Search() call.
|
Gets the links and labels found in the HTML file. Call this function only after Search(). You can use the beacon parameters to get only the links and labels between two specific beacons. To do so, specify either the beacon numbers or the beacons themselves.
If you use the integer version, 1 represents the first beacon, 2 the second one, and so on. If you have n beacons, valid values range from 0 to n+1, so you can also retrieve the labels and links before the first beacon and after the last one. Use beacon1=0 and beacon2=n+1 to retrieve all the links and labels. If you don't use any beacons, use beacon1=0 and beacon2=1.
The links and labels found are put in the two given BLists: links and labels. If one of the lists is NULL, it is not filled, which is useful if you want only the labels or only the links. Both lists contain HTMLLabel pointers, which you must not delete: they are owned by the HTMLFile.
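The integer numbering can be pictured with a short standalone model. This is a sketch under assumptions, not the class's code: BoundaryOffset and ItemsBetween are hypothetical names, beacons and items are reduced to byte offsets, and the choice of which end of the range is inclusive is made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the integer beacon indices, not HTMLFile's code.
// Index 0 stands for the start of the file, 1..n for the n beacons, and
// n+1 for the end of the file.
std::size_t BoundaryOffset(const std::vector<std::size_t>& beaconOffsets,
                           std::size_t fileLength, unsigned int index)
{
    assert(index <= beaconOffsets.size() + 1);  // valid values are 0..n+1
    if (index == 0)
        return 0;
    if (index == beaconOffsets.size() + 1)
        return fileLength;
    return beaconOffsets[index - 1];
}

// Returns the item offsets lying between the two boundaries. Treating the
// lower bound as inclusive and the upper bound as exclusive is an assumption.
std::vector<std::size_t> ItemsBetween(const std::vector<std::size_t>& itemOffsets,
                                      const std::vector<std::size_t>& beaconOffsets,
                                      std::size_t fileLength,
                                      unsigned int beacon1, unsigned int beacon2)
{
    const std::size_t lower = BoundaryOffset(beaconOffsets, fileLength, beacon1);
    const std::size_t upper = BoundaryOffset(beaconOffsets, fileLength, beacon2);
    std::vector<std::size_t> selected;
    for (std::size_t offset : itemOffsets) {
        if (offset >= lower && offset < upper)
            selected.push_back(offset);
    }
    return selected;
}
```

In this model, a file with two beacons accepts indices 0 to 3: the pair (0, 3) selects everything, (1, 2) selects only what lies between the two beacons, and with no beacons at all the pair (0, 1) covers the whole file.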
|
Parses the HTML file. If you want to use beacons, call AddBeacon() before Search(). Use the GetLists() function to retrieve the links and labels found.
|