CMPF104: Data Cleaning and Preprocessing: Data Science and Data Analytics: Programming For Foundation In Engineering, Assignment, UNITEN, Malaysia
| University | Universiti Tenaga Nasional (UNITEN) |
| Subject | CMPF104: Programming For Foundation In Engineering |
Data Science and Data Analytics
Download the dataset from BRIGHTEN. If your student ID ends with an odd number, select the Concrete_Data_A dataset; if it ends with an even number, select the Concrete_Data_B dataset. Use Python attributes, functions and libraries to solve the following problems.
a) Data Cleaning and Preprocessing:
- Use Pandas to load the dataset. Name the dataframe as concrete_df_XXX.
- Remove the ‘Number’ column using the .drop() function and display the first ten (10) rows of the data.
- Handle any missing values by dropping or replacing the empty cells. Check for missing values using functions like .info() or .isnull().sum().
- Convert the dataframe to an array using the .to_numpy() function.
- Divide the data into two sets, with 80% for training and 20% for testing. Name the datasets train_data_XXX and test_data_XXX.
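The steps in part a) can be sketched as follows. This is a minimal illustration under stated assumptions, not a model answer: the real file name and format on BRIGHTEN are not given here, so a small synthetic dataframe stands in for the download, and every column name except ‘Number’ is hypothetical. Missing values are replaced with the column mean, which is only one of the options the brief allows, and the 80/20 split is taken in row order without shuffling.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded file. With the real data this would be
# a load call such as pd.read_csv(...) on the file obtained from BRIGHTEN
# (file name and format are not stated in the brief).
rng = np.random.default_rng(0)
n_rows = 20
concrete_df_XXX = pd.DataFrame({
    "Number": np.arange(1, n_rows + 1),
    "Cement": rng.uniform(100, 500, n_rows),    # hypothetical column
    "Water": rng.uniform(120, 250, n_rows),     # hypothetical column
    "Strength": rng.uniform(10, 80, n_rows),    # hypothetical column
})
concrete_df_XXX.loc[3, "Water"] = np.nan  # inject one missing value for the demo

# Remove the 'Number' column and show the first ten rows.
concrete_df_XXX = concrete_df_XXX.drop(columns=["Number"])
print(concrete_df_XXX.head(10))

# Check for missing values.
concrete_df_XXX.info()
print(concrete_df_XXX.isnull().sum())

# Replace missing cells with the column mean (dropna() would drop the rows instead).
concrete_df_XXX = concrete_df_XXX.fillna(concrete_df_XXX.mean(numeric_only=True))

# Convert the dataframe to a NumPy array.
data_array = concrete_df_XXX.to_numpy()

# 80% / 20% split in row order (no shuffling in this sketch).
split_index = int(0.8 * len(data_array))
train_data_XXX = data_array[:split_index]
test_data_XXX = data_array[split_index:]
print(train_data_XXX.shape, test_data_XXX.shape)
```

If the rows of the real dataset are ordered in some meaningful way, shuffling the array before slicing may give a more representative test set.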
b) Data Analysis:
- Calculate the correlation between the variables in the dataframe.
- Utilize NumPy and Pandas to calculate summary statistics of the data such as maximum, minimum, standard deviation, average, median and mode of each category.
- Use Pandas functions like .describe() for an overview of summary statistics and apply NumPy functions for specific calculations.
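One possible way to approach part b) is sketched below. The dataframe and its column names are hypothetical placeholders for the cleaned concrete data. NumPy has no built-in mode function, so the mode is taken from Pandas here. Note that np.std defaults to the population standard deviation (ddof=0), whereas .describe() reports the sample standard deviation, so ddof=1 is passed to make the two agree.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the cleaned dataframe from part a).
df = pd.DataFrame({
    "Cement": [300.0, 250.0, 400.0, 250.0, 350.0],
    "Water": [180.0, 200.0, 160.0, 200.0, 170.0],
    "Strength": [40.0, 30.0, 60.0, 30.0, 50.0],
})

# Pairwise (Pearson) correlation between all columns.
correlation = df.corr()
print(correlation)

# Quick overview of the summary statistics.
print(df.describe())

# Specific statistics computed column by column with NumPy (axis=0).
values = df.to_numpy()
summary = pd.DataFrame({
    "max": np.max(values, axis=0),
    "min": np.min(values, axis=0),
    "std": np.std(values, axis=0, ddof=1),   # sample std, matches .describe()
    "mean": np.mean(values, axis=0),
    "median": np.median(values, axis=0),
    "mode": df.mode().iloc[0].to_numpy(),    # first mode of each column
}, index=df.columns)
print(summary)
```

When a column has several equally frequent values, df.mode() returns more than one row; this sketch keeps only the first.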
c) Visualization:
- Use Matplotlib to create visualizations such as line plots for train and test data across all categories.
- Generate histogram plots and box plots for all variables.
- Ensure that the visualizations are clear, informative, and aesthetically pleasing.
- Customize your plots by adding titles, labels and legends.
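The plots in part c) could be laid out along these lines. This is a sketch on random placeholder data: the column names, figure sizes, bin count and output file names are all assumptions. The non-interactive Agg backend is selected so the script runs without a display; in a notebook or desktop session, the backend line can be removed and plt.show() used instead of savefig.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data standing in for the array from part a).
columns = ["Cement", "Water", "Strength"]  # hypothetical column names
rng = np.random.default_rng(0)
data = rng.uniform([100, 120, 10], [500, 250, 80], size=(50, 3))
split_index = int(0.8 * len(data))
train_data_XXX, test_data_XXX = data[:split_index], data[split_index:]

# Line plots: one panel per column, train and test drawn against the row index.
train_index = np.arange(split_index)
test_index = np.arange(split_index, len(data))
fig_line, line_axes = plt.subplots(len(columns), 1, figsize=(8, 8), sharex=True)
for col, (ax, name) in enumerate(zip(line_axes, columns)):
    ax.plot(train_index, train_data_XXX[:, col], label="Train")
    ax.plot(test_index, test_data_XXX[:, col], label="Test")
    ax.set_title(f"{name}: train vs test")
    ax.set_ylabel(name)
    ax.legend()
line_axes[-1].set_xlabel("Sample index")
fig_line.tight_layout()
fig_line.savefig("line_plots.png")  # hypothetical file name

# Histograms: one panel per column.
fig_hist, hist_axes = plt.subplots(1, len(columns), figsize=(12, 4))
for col, (ax, name) in enumerate(zip(hist_axes, columns)):
    ax.hist(data[:, col], bins=10, edgecolor="black")
    ax.set_title(f"Histogram of {name}")
    ax.set_xlabel(name)
    ax.set_ylabel("Frequency")
fig_hist.tight_layout()
fig_hist.savefig("histograms.png")  # hypothetical file name

# Box plots: one box per column on a shared axis.
fig_box, box_ax = plt.subplots(figsize=(8, 4))
box_ax.boxplot(data)
box_ax.set_xticks(range(1, len(columns) + 1))
box_ax.set_xticklabels(columns)
box_ax.set_title("Box plots of all variables")
box_ax.set_ylabel("Value")
fig_box.tight_layout()
fig_box.savefig("box_plots.png")  # hypothetical file name
```

Because the variables may sit on very different scales, separate box-plot panels per variable may read better than a shared axis for the real data.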