
CP1407 Assignment Help: C/C++ and Java Programming

Date: 2024-12-13  Source: 合肥网 (hfw.cc)  Author: hfw.cc



CP1407 Assignment 2 
 
 
 
Note: This is an individual assignment. While it is expected that students will 
discuss their ideas with one another, students need to be aware of their 
responsibilities in ensuring that they do not deliberately or inadvertently 
plagiarise the work of others. 
 
 
Assignment 2 – Practice on various Machine Learning algorithms 
 
 
 
 1. [Data Pre-Processing, Clustering] [10 marks] 
Why is attribute scaling of data important? The following table contains sample 
records giving the number of employees and the total revenue generated by particular 
stores of a supermarket. Use the table as an example to discuss the necessity of 
normalisation in any proximity measurement for clustering purposes. 
 
Supermarket ID   Employee Count   Revenue
001              38               $5,500,000
002              29               $5,000,000
003              24               $5,000,000
004              10               $890,000
005              40               $2,500,000
006              31               $3,200,000
007              14               $678,000
008              35               $5,200,000
009              30               $5,300,000
010              22               $5,500,000
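
As an illustration of why scaling matters for proximity measures, the following minimal Python sketch compares Euclidean distances between stores before and after rescaling. It assumes min-max normalisation to [0, 1] and Euclidean distance; neither is prescribed by the question, and the names `stores`, `min_max_scale` and `euclidean` are hypothetical.

```python
import math

# (employee count, revenue in dollars), copied from the table above
stores = {
    "001": (38, 5_500_000), "002": (29, 5_000_000), "003": (24, 5_000_000),
    "004": (10, 890_000),   "005": (40, 2_500_000), "006": (31, 3_200_000),
    "007": (14, 678_000),   "008": (35, 5_200_000), "009": (30, 5_300_000),
    "010": (22, 5_500_000),
}


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def min_max_scale(rows):
    """Rescale every attribute to [0, 1] using that attribute's own min and max."""
    columns = list(zip(*rows.values()))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return {
        key: tuple((v - lo) / (hi - lo) for v, lo, hi in zip(row, lows, highs))
        for key, row in rows.items()
    }


scaled = min_max_scale(stores)

for other in ("002", "010"):
    raw = euclidean(stores["001"], stores[other])
    norm = euclidean(scaled["001"], scaled[other])
    print(f"001 vs {other}: raw = {raw:,.2f}, scaled = {norm:.4f}")
```

On the raw values the revenue column dominates, so store 001 looks far closer to store 010 (identical revenue, 16 fewer employees) than to store 002. After rescaling, both attributes contribute on a comparable scale and the ordering of those two neighbours is reversed.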
 
 
 
 
2. [Classification – Decision Tree algorithm] [20 marks] 
Use the soybean dataset (diabetes.arff) to perform decision tree induction in Weka 
using three different decision tree induction algorithms: J48, REPTree, and 
RandomTree. Investigate different options, particularly looking at differences between 
pruned trees and unpruned trees. In discussing your results, consider the following 
questions. 
 
a) What are the effects of pruning on the results for the soybean datasets? 
b) Are there differences in the performances of the three decision tree algorithms? 
c) What impacts do other parameters of the algorithms have on the results? 
 
3. [Classification – Naïve Bayes algorithm] [30 marks] 
Suppose we have data on a few individuals examined at random for a basic health check. 
The following table gives the data on these individuals’ health-related attributes. 
 
Body Weight   Body Height   Blood Pressure   Blood Sugar Level   Habit       Class
Heavy         Tall          High             3                   Smoker      P
Heavy         Short         High             1                   Nonsmoker   P
Normal        Tall          Normal           3                   Nonsmoker   N
Heavy         Tall          Normal           2                   Smoker      N
Low           Medium        Normal           2                   Nonsmoker   N
Low           Tall          Normal           1                   Nonsmoker   P
Normal        Medium        High             3                   Smoker      P
Low           Short         High             2                   Smoker      P
Heavy         Tall          High             2                   Nonsmoker   P
Low           Medium        Normal           3                   Smoker      P
Heavy         Medium        Normal           3                   Smoker      N
 
Use the data together with the Naïve Bayes classifier to classify the following new 
instance. Create and use the classifier by hand, not with Weka, and show all your 
working. 
Body Weight   Body Height   Blood Pressure   Blood Sugar Level   Habit       Class
Low           Tall          High             2                   Smoker      ?
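
A hand calculation of this kind can be cross-checked with a short counting script. The sketch below is only an illustration under stated assumptions: it treats Blood Sugar Level as a categorical attribute (the question does not say whether it is nominal or numeric), and the function `naive_bayes_scores` and its optional Laplace smoothing parameter `alpha` are hypothetical additions, not part of the assignment.

```python
from collections import Counter

# Training rows copied from the table above; the last field is the class.
# Assumption: Blood Sugar Level is treated as a category, hence the strings.
ROWS = [
    ("Heavy", "Tall", "High", "3", "Smoker", "P"),
    ("Heavy", "Short", "High", "1", "Nonsmoker", "P"),
    ("Normal", "Tall", "Normal", "3", "Nonsmoker", "N"),
    ("Heavy", "Tall", "Normal", "2", "Smoker", "N"),
    ("Low", "Medium", "Normal", "2", "Nonsmoker", "N"),
    ("Low", "Tall", "Normal", "1", "Nonsmoker", "P"),
    ("Normal", "Medium", "High", "3", "Smoker", "P"),
    ("Low", "Short", "High", "2", "Smoker", "P"),
    ("Heavy", "Tall", "High", "2", "Nonsmoker", "P"),
    ("Low", "Medium", "Normal", "3", "Smoker", "P"),
    ("Heavy", "Medium", "Normal", "3", "Smoker", "N"),
]


def naive_bayes_scores(rows, instance, alpha=0.0):
    """Return {class: prior * product of per-attribute conditional probabilities}.

    alpha is an optional Laplace smoothing constant; 0 means plain relative
    frequencies, so a value never seen with a class gives that class a score of 0.
    """
    class_counts = Counter(row[-1] for row in rows)
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / len(rows)  # prior P(class)
        for i, value in enumerate(instance):
            matches = sum(1 for r in rows if r[-1] == label and r[i] == value)
            n_values = len({r[i] for r in rows})  # distinct values of attribute i
            score *= (matches + alpha) / (n_label + alpha * n_values)
        scores[label] = score
    return scores


new_instance = ("Low", "Tall", "High", "2", "Smoker")
scores = naive_bayes_scores(ROWS, new_instance)
total = sum(scores.values())
for label, score in scores.items():
    print(label, score, score / total if total else float("nan"))
```

The scores are unnormalised; dividing by their sum, as in the last loop, gives posterior probabilities. Calling the function with `alpha=1.0` shows how smoothing changes the result when an attribute value never occurs with one of the classes.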
 
 4. [Association Rules Mining] [20 marks] 
The following table shows the film-watching histories of several viewers of an on-demand service. 
 
User Id Items 
001 Airplane!, Downfall, Evita, Idiocracy, Jurassic Park 
002 Casablanca, Downfall, Evita, Flubber, Jurassic Park 
003 Airplane!, Downfall, Half Baked, Jurassic Park 
004 Airplane!, Downfall 
005 Casablanca, Downfall, Flubber, Jurassic Park, Zoolander 
006 Casablanca, Downfall, Half Baked, Idiocracy, Zoolander 
007 Evita, Idiocracy, Jurassic Park 
008 Downfall, Jurassic Park, Zoolander 
009 Casablanca, Downfall, Evita, Half Baked, Jurassic Park, Zoolander 
 
a) Follow the steps outlined in Practical 07 and conduct a mining task for Boolean 
association rules using the Apriori algorithm in Weka. 
b) Set different parameters and observe the association rules discovered. 
c) Weka provides association evaluation parameters other than support and 
confidence. Note the values that these parameters give for some example 
rules. 
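
To make the evaluation measures concrete, the following minimal Python sketch computes support, confidence and lift for a single rule over the table above. It is not the Apriori algorithm and does not reproduce Weka's output; lift is used here only as one example of a measure beyond support and confidence, and the function names are hypothetical.

```python
# Viewing histories copied from the table above.
TRANSACTIONS = {
    "001": {"Airplane!", "Downfall", "Evita", "Idiocracy", "Jurassic Park"},
    "002": {"Casablanca", "Downfall", "Evita", "Flubber", "Jurassic Park"},
    "003": {"Airplane!", "Downfall", "Half Baked", "Jurassic Park"},
    "004": {"Airplane!", "Downfall"},
    "005": {"Casablanca", "Downfall", "Flubber", "Jurassic Park", "Zoolander"},
    "006": {"Casablanca", "Downfall", "Half Baked", "Idiocracy", "Zoolander"},
    "007": {"Evita", "Idiocracy", "Jurassic Park"},
    "008": {"Downfall", "Jurassic Park", "Zoolander"},
    "009": {"Casablanca", "Downfall", "Evita", "Half Baked", "Jurassic Park",
            "Zoolander"},
}


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    items = set(itemset)
    hits = sum(1 for t in TRANSACTIONS.values() if items <= t)
    return hits / len(TRANSACTIONS)


def confidence(antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence relative to the consequent's baseline support."""
    return confidence(antecedent, consequent) / support(consequent)


rule = ({"Downfall"}, {"Jurassic Park"})
print("support   ", support(rule[0] | rule[1]))
print("confidence", confidence(*rule))
print("lift      ", lift(*rule))
```

A lift above 1 suggests the antecedent and consequent occur together more often than they would if independent, and a lift below 1 suggests the opposite, which is why a rule can have high confidence yet still be uninteresting.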
 
5. [Clustering] [20 marks] 
Consider the following 2-dimensional point data set presented in (x,y) coordinates: 
 P1(1,1), P2(1,3), P3(4,3), P4(5,4), P5(9,4), P6(9,6). 
Apply the hierarchical clustering method by hand (using the agglomerative algorithm) 
to obtain the final two clusters. Use the Manhattan distance function to measure the 
distance between points and use the single-linkage scheme for clustering. Show all 
your working. 
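
The procedure described above can be sketched in a few lines of Python to check hand-computed distances and merge order. This is a minimal illustration, not a required method: the names `manhattan`, `single_link` and `agglomerate` are hypothetical, and ties between equally close pairs are broken by list order, which is an arbitrary choice.

```python
from itertools import combinations

# Points copied from the question.
POINTS = {
    "P1": (1, 1), "P2": (1, 3), "P3": (4, 3),
    "P4": (5, 4), "P5": (9, 4), "P6": (9, 6),
}


def manhattan(a, b):
    """Manhattan (city-block) distance between two 2-D points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def single_link(c1, c2):
    """Single linkage: distance between the closest pair across two clusters."""
    return min(manhattan(POINTS[p], POINTS[q]) for p in c1 for q in c2)


def agglomerate(target=2):
    """Merge the two closest clusters repeatedly until `target` clusters remain."""
    clusters = [frozenset([name]) for name in POINTS]
    history = []  # (cluster a, cluster b, merge distance) for each step
    while len(clusters) > target:
        # min() returns the first minimal pair, so ties follow list order
        a, b = min(combinations(clusters, 2), key=lambda pair: single_link(*pair))
        history.append((sorted(a), sorted(b), single_link(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters, history


clusters, history = agglomerate(target=2)
for a, b, d in history:
    print(f"merge {a} + {b} at distance {d}")
print("final clusters:", [sorted(c) for c in clusters])
```

Printing the merge history alongside the final clusters makes it easy to compare each step with a hand-built distance matrix.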
 
Rubric 
 Exemplary   Good     Satisfactory   Limited   Very Limited 
 90-100%     70-80%   50-60%         30-40%    0-20% 

