Matches in ScholarlyData for { <https://w3id.org/scholarlydata/inproceedings/www2012/poster/125> ?p ?o. }
Showing items 1 to 15 of 15 with 100 items per page.
- 125 creator carlos-r-rivero.
- 125 creator david-ruiz.
- 125 creator inma-hernandez.
- 125 creator rafael-corchuelo.
- 125 type InProceedings.
- 125 label "A Statistical Approach to URL-Based Web Page Clustering".
- 125 sameAs 125.
- 125 abstract "Many techniques for web page classification have been proposed in the past. Most of them use features from the page to be classified, which means that even if the page is not of an interesting class, it has to be downloaded to find it out. We propose a technique to cluster web pages by means of their URL exclusively. In contrast to other proposals, we analyse features that are outside the page, hence, we do not need to download a page to classify it. Also, it is non-supervised, requiring little intervention from the user. Furthermore, we do not need to crawl extensively a site in order to train and build a classifier for that site, but only a small subset of pages. We have performed an experiment over several highly visited sites to evaluate the performance of our classifier, obtaining good precision and recall results.".
- 125 hasAuthorList authorList.
- 125 isPartOf proceedings.
- 125 isPartOf proceedings.
- 125 keyword "URL Classification".
- 125 keyword "URL Patterns".
- 125 keyword "Web Page Clustering".
- 125 title "A Statistical Approach to URL-Based Web Page Clustering".
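
The triple pattern at the top of this listing can also be written as a SPARQL query. The following is a minimal sketch that only builds the query string; it assumes nothing about where the query would be sent. The names `SUBJECT`, `build_query`, and `limit` are hypothetical and do not come from the page. The default limit of 100 simply mirrors the page size reported above.

```python
# Subject URI copied from the triple pattern at the top of this listing.
SUBJECT = "https://w3id.org/scholarlydata/inproceedings/www2012/poster/125"


def build_query(subject: str, limit: int = 100) -> str:
    """Return a SPARQL SELECT query for every (?p, ?o) pair of `subject`.

    Hypothetical helper for illustration: it only formats text and
    does not contact any endpoint.
    """
    return (
        "SELECT ?p ?o\n"
        f"WHERE {{ <{subject}> ?p ?o . }}\n"
        f"LIMIT {limit}"
    )


query = build_query(SUBJECT)
print(query)
```

The resulting string could be passed to any SPARQL client. Which endpoint serves ScholarlyData is not stated on this page, so it is left out here.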