Final Project
CS-UY 4563 - Introduction to Machine Learning
Overview
• Partner with one student and select a machine learning problem of your choice.
• Apply the machine learning techniques you’ve learned during the course to
your chosen problem.
• Present your project to the class at the semester’s end.
Submission Requirements on Gradescope
Submit the following on Gradescope by the evening before the first presentation (exact
date to be announced):
• Presentation slides.
• Project write-up (PDF format).
• Project code as a Jupyter Notebook. If necessary, a GitHub link is acceptable.
• If using a custom dataset, upload it to Gradescope (or provide a GitHub link, if
necessary).
Project Guidelines
Write-Up Requirements
Your project write-up should include the following:
1. Introduction: Describe your data set and the problem you aim to solve.
2. Perform some unsupervised analysis (see the first sketch following this list):
• Explore patterns or structure in the data using clustering and dimensionality reduction (e.g., PCA).
• Visualize the training data (do not look at the validation or test data):
– Plot individual features to understand their distribution (e.g., histograms
or density plots).
– Plot individual features and their relationship with the target variable.
– Create a correlation matrix to analyze relationships between features.
• Discuss any interesting structure present in the data. If you don’t find any
interesting structure, describe what you tried.
3. Supervised analysis: Train at least three distinct learning models discussed in class
(such as Linear Regression, Logistic Regression, SVM, Neural Networks, CNN). You can
turn a regression task into a classification task by binning, or select a different feature
of the same dataset as the target for your model; you can also use SVR. If you wish to
use a model not discussed in class, you must discuss it with me first, or you will not
receive any points for that model.
For implementation, you may:
• Use your own implementation from homework or developed independently.
• Use libraries such as Keras, scikit-learn, or TensorFlow.
For each model, you must do the following (even if you get a very high accuracy, perform
these transformations to see what happens; see the second sketch following this list):
• Try different feature transformations; you should have at least three. For example,
try a polynomial transformation, PCA, or a radial-basis-function kernel. For neural
networks, different architectures (e.g., networks with varying numbers of layers) can
also be considered forms of feature transformations, as they learn complex
representations of the input data.
• Use different regularization techniques; you should have at least six different
regularization values per model.
4. Table of Results:
• Provide a table with training and validation metrics for every model. Include
results for the different parameter settings (e.g., different regularization values).
– For classification, include metrics such as precision and recall.
– For regression models, report metrics like MSE and R². For example, suppose
you’re using Ridge Regression and varying the value of λ; in that case, your
table should contain the training and validation metrics for every λ value you
used.
• Plot and analyze how performance metrics (like accuracy, precision, recall, or MSE)
change with different feature transformations and hyperparameters (e.g., regularization
settings, learning rate).
5. Analytical Discussion:
• Analyze the experimental results and explain your key findings. Provide a chart
summarizing these findings.
• Highlight the impact of feature transformations, regularization, and other hyperparameters
on the model’s performance. Refer to the graphs provided in earlier
sections to support your analysis. Focus on interpreting:
– Whether the models overfit or underfit the data.
– How bias and variance affect performance, and which parameter choices
helped achieve better generalization.
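Below is a minimal sketch of the unsupervised analysis described in item 2. It assumes a
pandas DataFrame named df containing only your training examples, with numeric feature
columns and a column named target; both names are placeholders for your own data, and the
library choices (pandas, matplotlib, scikit-learn) are suggestions rather than requirements.

    # Unsupervised-analysis sketch (item 2). Run this on the training split only.
    # `df` and the "target" column are placeholders for your own data.
    import matplotlib.pyplot as plt
    from sklearn.cluster import KMeans
    from sklearn.decomposition import PCA
    from sklearn.preprocessing import StandardScaler

    features = df.drop(columns=["target"])

    # Per-feature distributions (histograms).
    features.hist(bins=30, figsize=(12, 8))
    plt.tight_layout()
    plt.show()

    # Correlation matrix between features.
    corr = features.corr()
    plt.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
    plt.colorbar()
    plt.xticks(range(len(corr)), corr.columns, rotation=90)
    plt.yticks(range(len(corr)), corr.columns)
    plt.title("Feature correlation matrix")
    plt.show()

    # PCA to two dimensions, then k-means clustering on the projection.
    X_scaled = StandardScaler().fit_transform(features)
    X_2d = PCA(n_components=2).fit_transform(X_scaled)
    clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_2d)

    plt.scatter(X_2d[:, 0], X_2d[:, 1], c=clusters, s=10)
    plt.xlabel("PC 1")
    plt.ylabel("PC 2")
    plt.title("K-means clusters in PCA space")
    plt.show()

You would still plot each feature against the target and comment on any structure you see
(or on what you tried if you find none).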
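And here is a sketch of the supervised workflow from items 3 and 4 for a single model family,
using logistic regression as an example. X_train, y_train, X_val, and y_val are placeholders
for your own splits, and the specific transformations and regularization values shown are
illustrative, not required choices.

    # Supervised-analysis sketch (items 3-4): three feature transformations and a
    # sweep over six regularization values, with results collected into a table.
    # X_train, y_train, X_val, y_val are placeholders for your own splits.
    import matplotlib.pyplot as plt
    import pandas as pd
    from sklearn.decomposition import PCA
    from sklearn.kernel_approximation import RBFSampler
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score, precision_score, recall_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures, StandardScaler

    transforms = {
        "polynomial": PolynomialFeatures(degree=2),
        "pca": PCA(n_components=5),
        "rbf": RBFSampler(gamma=0.5, random_state=0),  # approximate RBF feature map
    }
    reg_values = [0.001, 0.01, 0.1, 1, 10, 100]  # at least six regularization values

    rows = []
    for name, transform in transforms.items():
        for C in reg_values:  # in scikit-learn, smaller C means stronger L2 regularization
            model = make_pipeline(StandardScaler(), transform,
                                  LogisticRegression(C=C, max_iter=5000))
            model.fit(X_train, y_train)
            val_pred = model.predict(X_val)
            rows.append({
                "transform": name,
                "C": C,
                "train_acc": accuracy_score(y_train, model.predict(X_train)),
                "val_acc": accuracy_score(y_val, val_pred),
                "val_precision": precision_score(y_val, val_pred, average="macro"),
                "val_recall": recall_score(y_val, val_pred, average="macro"),
            })

    results = pd.DataFrame(rows)
    print(results)  # the kind of table item 4 asks for

    # Plot validation accuracy against the regularization value for each transformation.
    results.pivot(index="C", columns="transform", values="val_acc").plot(logx=True, marker="o")
    plt.xlabel("C (inverse regularization strength)")
    plt.ylabel("Validation accuracy")
    plt.show()

For a regression target you would swap in, say, Ridge Regression and report MSE and R²
instead of accuracy, precision, and recall.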
Presentation Guidelines
• You and your partner will give a six-minute presentation to the class.
• Presentations will be held during the last two or three class periods and during the
final exam period for this class. You will be assigned a day for your presentation. If
we run out of time on the day you are scheduled to present, you will present on the
next day reserved for presentations.
• Attendance at all presentations is required. Part of your project grade
will be based on your attendance at your classmates’ presentations.
Important Notes on Academic Integrity
• Your submission will undergo plagiarism checks.
• If we suspect you of cheating, you will receive a 0 for your final project grade. See the
syllabus for additional penalties that may be applied.
Dataset Resources
Below are some resources where you can search for datasets. As a rough guideline, your
dataset should have at least 200 training examples and at least 10 features. You
are free to use these resources, look elsewhere, or create your own dataset.
• https://www.kaggle.com/competitions
• https://www.openml.org/
• https://paperswithcode.com/datasets
• https://registry.opendata.aws/
• https://dataportals.org/
• https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research
• https://www.reddit.com/r/datasets/
• https://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public
Modifications
• If you have a project idea that doesn’t satisfy all the requirements mentioned above,
please inform me, and we can discuss its viability as your final project.
• If you use techniques not covered in class, you must demonstrate your understanding
of these ideas.
Brightspace Submission Guidelines
• Dataset and Partner: Submit the link to your chosen dataset and your partner’s
name by October 30th.
• Final Submissions: Upload your presentation slides, project write-up, and code to
Gradescope by the evening before the first scheduled presentation. The exact date
will be announced once the total number of projects is confirmed. (I expect the due
date to be December 4th or December 9th.)
Potential Challenges and Resources
As you work with your dataset, you may encounter specific challenges that require additional
techniques or tools. Below are some topics and resources that might be useful.
Please explore these topics further through online research.
• Feature Reduction: Consider using PCA (which will be covered in class). PCA is
especially useful when working with SVMs, as they can be slow with high-dimensional
data.
If you choose to use SelectKBest from scikit-learn, you must understand why it works
before you use it.
• Creating Synthetic Examples: When using SMOTE or other methods to generate
synthetic data, ensure that only real data is used in the validation and test sets (see
the sketch after this list).
– If you use synthetic data, make sure your validation and test sets mirror the true
class proportions of the original dataset. A balanced test set for naturally unbalanced
data can give a misleading impression of your model’s real-world performance.
For more details, see: Handling Imbalanced Classes
• Working with Time Series Data: For insights on working with time series data,
visit: NIST Handbook on Time Series
• Handling Missing Feature Values:
– See Lecture 16 at Stanford STATS 306B
– Techniques to Handle Missing Data Values
– How to Handle Missing Data in Python
– Statistical Imputation for Missing Data
• Multiclass Classification:
– Understanding Softmax in Multiclass Classification
– Precision and Recall for Multiclass Metrics
• Optimizers for Neural Networks: You may use Adam or other optimizers for
training neural networks.
• Centering Image Data with Bounding Boxes: If you are working with image
data, you are allowed to use bounding boxes to center the objects in your images. You
can use libraries like OpenCV (cv2).
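The Creating Synthetic Examples point above is easy to get wrong, so here is a sketch of one
safe way to do it. It assumes a feature matrix X and labels y for an imbalanced classification
problem and the imbalanced-learn (imblearn) package; all names are placeholders.

    # Oversample the training split only; the validation and test sets keep real data
    # and the original class proportions (via stratified splits).
    # X and y are placeholders; requires the imbalanced-learn package.
    from imblearn.over_sampling import SMOTE
    from sklearn.model_selection import train_test_split

    # Stratified splits preserve the true class proportions in the val/test sets.
    X_train, X_tmp, y_train, y_tmp = train_test_split(
        X, y, test_size=0.4, stratify=y, random_state=0)
    X_val, X_test, y_val, y_test = train_test_split(
        X_tmp, y_tmp, test_size=0.5, stratify=y_tmp, random_state=0)

    # SMOTE is applied only to the training data, never to validation or test data.
    X_train_res, y_train_res = SMOTE(random_state=0).fit_resample(X_train, y_train)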
Tips
Don’t forget to scale your data as part of preprocessing. Be sure to document any modifications
you made, including the scaling or normalization techniques you applied.
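For example, a common pattern (a sketch assuming scikit-learn and placeholder splits
X_train, X_val, X_test) is to fit the scaler on the training data only and then reuse it for
the validation and test sets:

    # Fit the scaler on training data only, then apply the same statistics elsewhere.
    # X_train, X_val, X_test are placeholders for your own splits.
    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler().fit(X_train)   # learns mean and std from the training data
    X_train_scaled = scaler.transform(X_train)
    X_val_scaled = scaler.transform(X_val)
    X_test_scaled = scaler.transform(X_test)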
The following resource might be helpful. Please stick to topics we discussed in class or
those mentioned above: CS229: Practical Machine Learning Advice