Dataset threshold for the performance estimators in supervised machine learning experiments

Zanifa Omary, Fredrick Mtenzi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Citations (Scopus)

Abstract

Establishing a dataset threshold is one of the first steps when comparing the performance of machine learning algorithms. It involves using datasets of different sample sizes, characterised by the number of attributes and the number of instances available in each dataset. Currently, no established limit helps those unfamiliar with machine learning experiments to categorise these datasets as either small or large based on these two factors. In this paper we perform experiments in order to establish a dataset threshold. The established threshold will help experimenters who are new to supervised machine learning to categorise datasets based on the number of instances and attributes, and then to choose the appropriate performance estimation method. The experiments will involve four different datasets from the UCI Machine Learning Repository and two performance estimators. The performance of the methods will be measured using the F1-score.
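For reference, the F1-score named in the abstract is the harmonic mean of precision and recall. The following is a minimal sketch in plain Python of how it can be computed for a single positive class. The function name `f1_score`, the `positive` parameter, and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return F1 = 2 * precision * recall / (precision + recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # With no true positives, precision and recall are zero or undefined;
        # this sketch reports 0.0 in that case.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels invented for illustration only (not data from the paper).
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
score = f1_score(y_true, y_pred)
print(score)
```

In an experiment of the kind described, a score like this would be computed on the held-out portion produced by whichever performance estimator is being evaluated.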

Original language: English
Title of host publication: International Conference for Internet Technology and Secured Transactions, ICITST 2009
Publisher: IEEE Computer Society
ISBN (Print): 9781424456482
Publication status: Published - 2009
Externally published: Yes
Event: International Conference for Internet Technology and Secured Transactions, ICITST 2009 - London, United Kingdom
Duration: 9 Nov 2009 to 12 Nov 2009

Publication series

Name: International Conference for Internet Technology and Secured Transactions, ICITST 2009

Conference

Conference: International Conference for Internet Technology and Secured Transactions, ICITST 2009
Country/Territory: United Kingdom
City: London
Period: 9/11/09 to 12/11/09
