CMPF104: Data Cleaning and Preprocessing: Data Science and Data Analytics: Programming For Foundation In Engineering, Assignment, UNITEN, Malaysia
| University | Universiti Tenaga Nasional (UNITEN) |
| Subject | CMPF104: Programming For Foundation In Engineering |
Data Science and Data Analytics
Download the dataset from BRIGHTEN. If your student ID ends with an odd number, select the Concrete_Data_A dataset; if it ends with an even number, select the Concrete_Data_B dataset. Use Python attributes, functions and libraries to solve the following problems.
a) Data Cleaning and Preprocessing:
- Use Pandas to load the dataset. Name the dataframe as concrete_df_XXX.
- Remove the ‘Number’ column using the .drop() function and visualize the first ten (10) rows of the data.
- Handle any missing values by dropping or replacing the empty cells. Check for missing values using functions like .info() or .isnull().sum().
- Convert the dataframe to an array using the to_numpy() function.
- Divide the data into two sets, with 80% for train data and 20% for test data. Name the datasets train_data_XXX and test_data_XXX.
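The cleaning steps in part (a) can be sketched as follows. This is only an illustration under stated assumptions: the inline CSV text and its column names (Cement, Water, Age, Strength) are hypothetical stand-ins for the BRIGHTEN file, and filling missing cells with the column mean is one of the two options the task allows (dropping rows with .dropna() is the other).

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the BRIGHTEN dataset; the real columns may differ.
csv_text = """Number,Cement,Water,Age,Strength
1,540.0,162.0,28,79.99
2,540.0,,28,61.89
3,332.5,228.0,270,40.27
4,332.5,228.0,365,41.05
5,198.6,192.0,360,44.30
6,266.0,228.0,90,47.03
7,380.0,228.0,365,43.70
8,380.0,228.0,28,36.45
9,266.0,,28,45.85
10,475.0,228.0,28,39.29
"""

# With the real file, pass its path to pd.read_csv instead of the StringIO object.
concrete_df_XXX = pd.read_csv(io.StringIO(csv_text))

# Remove the 'Number' column and show the first ten rows.
concrete_df_XXX = concrete_df_XXX.drop(columns=["Number"])
print(concrete_df_XXX.head(10))

# Count the missing values in each column before handling them.
missing_before = concrete_df_XXX.isnull().sum()
print(missing_before)

# Replace empty cells with the column mean (assumed strategy).
concrete_df_XXX = concrete_df_XXX.fillna(concrete_df_XXX.mean(numeric_only=True))

# Convert the dataframe to a NumPy array.
concrete_array = concrete_df_XXX.to_numpy()

# Split 80% / 20% in row order, using integer arithmetic for the cut point.
split_index = (len(concrete_array) * 80) // 100
train_data_XXX = concrete_array[:split_index]
test_data_XXX = concrete_array[split_index:]

print(train_data_XXX.shape, test_data_XXX.shape)
```

The split here keeps the original row order. If the rows should be shuffled first, a seeded generator such as np.random.default_rng can permute the row indices before slicing.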
b) Data Analysis:
- Calculate the correlation between the variables in the dataframe.
- Utilize NumPy and Pandas to calculate summary statistics of the data such as maximum, minimum, standard deviation, average, median and mode of each category.
- Use Pandas functions like .describe() for an overview of summary statistics and apply NumPy functions for specific calculations.
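A minimal sketch of the analysis in part (b), assuming a small cleaned dataframe with hypothetical column names and values. NumPy has no built-in mode function, so the mode is taken from the Pandas .mode() method; when several values tie, the smallest is reported.

```python
import numpy as np
import pandas as pd

# Hypothetical cleaned data standing in for concrete_df_XXX.
concrete_df_XXX = pd.DataFrame({
    "Cement": [540.0, 540.0, 332.5, 332.5, 198.6],
    "Water": [162.0, 162.0, 228.0, 228.0, 192.0],
    "Strength": [79.99, 61.89, 40.27, 41.05, 44.30],
})

# Pairwise Pearson correlation between the columns.
correlation = concrete_df_XXX.corr()
print(correlation)

# Overview of summary statistics from Pandas.
overview = concrete_df_XXX.describe()
print(overview)

# Specific statistics from NumPy, computed column by column (axis=0).
values = concrete_df_XXX.to_numpy()
summary = pd.DataFrame(
    {
        "max": np.max(values, axis=0),
        "min": np.min(values, axis=0),
        # ddof=1 gives the sample standard deviation, matching .describe().
        "std": np.std(values, axis=0, ddof=1),
        "mean": np.mean(values, axis=0),
        "median": np.median(values, axis=0),
    },
    index=concrete_df_XXX.columns,
)

# .mode() returns modes sorted in ascending order; take the first row.
summary["mode"] = concrete_df_XXX.mode().iloc[0]
print(summary)
```

Note that np.std defaults to the population standard deviation (ddof=0), while Pandas uses the sample version (ddof=1), so the two only agree when ddof is set explicitly.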
c) Visualization:
- Use Matplotlib to create visualizations such as line plots for train and test data across all categories.
- Generate histogram plots and box plots for all variables.
- Ensure that the visualizations are clear, informative, and aesthetically pleasing.
- Customize your plots by adding titles, labels and legends.
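The plots in part (c) might be produced along these lines. The data here is randomly generated and the column names are hypothetical, so the sketch shows only the plotting pattern (titles, axis labels, legends), not real results. The Agg backend is selected so the script runs without a display; remove that line and call plt.show() to view the figures interactively.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove to view plots interactively
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data standing in for the cleaned concrete array.
rng = np.random.default_rng(0)
columns = ["Cement", "Water", "Strength"]
data = rng.normal(loc=[300.0, 180.0, 35.0], scale=[80.0, 20.0, 10.0], size=(50, 3))

# 80% / 20% split in row order.
split_index = (len(data) * 80) // 100
train_data_XXX, test_data_XXX = data[:split_index], data[split_index:]

# Line plots: one subplot per column, train and test drawn on a shared index.
fig_line, axes_line = plt.subplots(len(columns), 1, figsize=(8, 9), sharex=True)
for i, (ax, name) in enumerate(zip(axes_line, columns)):
    ax.plot(np.arange(split_index), train_data_XXX[:, i], label="Train")
    ax.plot(np.arange(split_index, len(data)), test_data_XXX[:, i], label="Test")
    ax.set_title(f"{name}: train and test data")
    ax.set_ylabel(name)
    ax.legend()
axes_line[-1].set_xlabel("Sample index")
fig_line.tight_layout()

# Histograms: one subplot per column.
fig_hist, axes_hist = plt.subplots(1, len(columns), figsize=(12, 4))
for i, (ax, name) in enumerate(zip(axes_hist, columns)):
    ax.hist(data[:, i], bins=10, edgecolor="black")
    ax.set_title(f"Histogram of {name}")
    ax.set_xlabel(name)
    ax.set_ylabel("Frequency")
fig_hist.tight_layout()

# Box plots: one box per column on a single axis.
fig_box, ax_box = plt.subplots(figsize=(8, 4))
ax_box.boxplot(data)
ax_box.set_xticks(range(1, len(columns) + 1))
ax_box.set_xticklabels(columns)
ax_box.set_title("Box plots of all variables")
ax_box.set_xlabel("Variable")
ax_box.set_ylabel("Value")
fig_box.tight_layout()
```

Because the variables are on different scales, a single box-plot axis can compress the smaller ones; drawing one box plot per subplot, as done for the histograms, is a reasonable alternative.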