Clustering patent document in the field of ICT (Information & Communication Technology)

Agus Widodo, Indra Budi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

4 Citations (Scopus)

Abstract

The current classification of patent data, which follows the IPC (International Patent Classification) of the WIPO (World Intellectual Property Organization), is deemed not to reflect the field of ICT (Information & Communication Technology). ICT applications are usually included in sections G (Physics) and H (Electricity). This paper evaluates eight groupings of patents based on the IPC classes (G01, G06, G09, G11, H01, H03, H04, and H06) for patents registered at the Directorate General of Intellectual Property Rights in Indonesia from 1991 to 2000. The algorithms used for grouping are KMeans, KMeans, Hierarchical Clustering, and a combination of these three algorithms with SVD (Singular Value Decomposition). For external validation, Purity and F-Measure are used, whereas Silhouette is used for internal validation. The experimental results indicate that SVD improves the clustering results. In addition, using the abstract does not necessarily improve clustering performance, and indexing by phrase does not always yield better clusters than indexing by word. Moreover, no cluster has a purity greater than 50%, which suggests that the existing IPC classification does not yet accommodate the field of ICT appropriately.
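The purity figure quoted in the abstract can be illustrated with a small sketch. It assumes the common definition of purity (the share of a cluster's documents that belong to its most frequent class); the paper's exact formulation is not given in this record. The function names and the toy cluster assignments and IPC labels below are hypothetical and are not data from the paper.

```python
from collections import Counter, defaultdict


def _class_counts(cluster_ids, class_labels):
    """Count class labels inside each cluster."""
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have equal length")
    counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, class_labels):
        counts[cluster][label] += 1
    return counts


def cluster_purities(cluster_ids, class_labels):
    """Per-cluster purity: size of the majority class / cluster size."""
    counts = _class_counts(cluster_ids, class_labels)
    return {
        cluster: max(c.values()) / sum(c.values())
        for cluster, c in counts.items()
    }


def overall_purity(cluster_ids, class_labels):
    """Overall purity: sum of majority-class sizes / number of documents."""
    counts = _class_counts(cluster_ids, class_labels)
    majority_total = sum(max(c.values()) for c in counts.values())
    return majority_total / len(cluster_ids)


# Hypothetical toy data: cluster assignment and IPC class for ten patents.
clusters = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2]
ipc = ["G06", "G06", "H04", "G01", "H04", "H04", "H04", "G06", "H01", "G11"]

per_cluster = cluster_purities(clusters, ipc)
total = overall_purity(clusters, ipc)
print(per_cluster)
print(total)
```

Under this definition, a cluster whose purity does not exceed 50% has no IPC class that accounts for more than half of its documents, which is the sense in which the abstract reads low purity as a mismatch between the IPC classes and the clusters found.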

Original language: English
Title of host publication: 2011 International Conference on Semantic Technology and Information Retrieval, STAIR 2011
Pages: 203-208
Number of pages: 6
Publication status: Published - 2011
Event: 2011 International Conference on Semantic Technology and Information Retrieval, STAIR 2011 - Putrajaya, Malaysia
Duration: 28 Jun 2011 to 29 Jun 2011

Keywords

  • Clustering
  • Information & Communication Technology
  • Kmeans
  • Patent
  • Singular Value Decomposition
