The Geek Diary

ValueError: Masked arrays must be 1-D

by admin

A scatter plot is a basic plot of dots. You can draw one by calling plt.scatter(x, y). The following example shows a scatter plot of random dots:

import numpy as np
import matplotlib.pyplot as plt

# Set the random seed for NumPy function to keep the results reproducible
np.random.seed(42)

# Generate a 2 by 100 NumPy Array of random decimals between 0 and 1
r = np.random.rand(2,100)

# Plot the x and y coordinates of the random dots on a scatter plot
plt.scatter(r[0],r[1])

# Show the plot
plt.show()

The following plot is the result of the preceding code:

[Figure: scatter plot of the random dots]

The following code was written in Python to generate a scatter plot of the test data after splitting the data into training and test sets. The data read from a sample file was converted to a matrix so that it could be used with the linear regression algorithm.

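import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split

# Read the sample data and preview the first rows
data = pd.read_csv('data_1.csv', encoding='utf-8')
print(data.head())

# Convert the columns to matrices and transpose them into column vectors
X_data = np.matrix(data['X1'])
Y_data = np.matrix(data['Y1'])
X_train, X_test, Y_train, Y_test = train_test_split(
    np.transpose(X_data), np.transpose(Y_data), test_size=0.3)

# Passing the matrices straight to scatter is what triggers the error
plt.scatter(X_test, Y_test, color='black')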

However, passing this data to the scatter plot raised the following error:

ValueError: Masked arrays must be 1-D

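This is because X_test and Y_test are passed to plt.scatter as np.matrix objects. A np.matrix is always two-dimensional (even its ravel() method returns a 1 x N matrix), whereas scatter needs data it can reduce to one-dimensional arrays. To get around the problem, we can cast the data to a NumPy array: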

plt.scatter(np.array(X_test), np.array(Y_test),  color='black')
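The difference between the two types can be checked directly by comparing shapes. The following is a minimal sketch using a small hand-made column instead of the CSV data; the names col_matrix and col_array are made up for illustration.

```python
import numpy as np

# A 3 x 1 column stored as np.matrix, similar to what
# np.transpose(np.matrix(...)) produces in the code above
col_matrix = np.matrix([1.0, 2.0, 3.0]).T

# The same values cast to a plain ndarray
col_array = np.array(col_matrix)

# A matrix stays two-dimensional even after ravel()
print(col_matrix.shape, col_matrix.ravel().shape)

# A plain ndarray can be flattened to one dimension
print(col_array.shape, col_array.ravel().shape)
```

The matrix keeps two dimensions after ravel(), while the ndarray flattens to a single dimension, which is the form a scatter plot can work with.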

Casting to a NumPy array should stop the error from being thrown. The complete modified code is:

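import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import train_test_split

# Read the sample data and preview the first rows
data = pd.read_csv('data_1.csv', encoding='utf-8')
print(data.head())

# Convert the columns to matrices and transpose them into column vectors
X_data = np.matrix(data['X1'])
Y_data = np.matrix(data['Y1'])
X_train, X_test, Y_train, Y_test = train_test_split(
    np.transpose(X_data), np.transpose(Y_data), test_size=0.3)

# Cast the matrices to plain NumPy arrays before plotting
plt.scatter(np.array(X_test), np.array(Y_test), color='black')
plt.show()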
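Another option, not part of the original code, is to avoid np.matrix from the start and build the column vectors as plain arrays with reshape(-1, 1). The sketch below uses made-up values (x_values and y_values are hypothetical stand-ins for data['X1'] and data['Y1']) and only checks the resulting shapes.

```python
import numpy as np

# Hypothetical stand-ins for data['X1'] and data['Y1'] from the CSV file
x_values = np.array([1.0, 2.0, 3.0, 4.0])
y_values = np.array([2.1, 3.9, 6.2, 7.8])

# reshape(-1, 1) gives an (n, 1) ndarray column without using np.matrix
X_data = x_values.reshape(-1, 1)
Y_data = y_values.reshape(-1, 1)

# Unlike a matrix, an ndarray column flattens to one dimension
print(type(X_data).__name__, X_data.shape, X_data.ravel().shape)
print(type(Y_data).__name__, Y_data.shape, Y_data.ravel().shape)
```

Columns built this way are ordinary ndarrays, so the pieces returned by train_test_split should be accepted by plt.scatter without an extra np.array cast.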

© 2022 · The Geek Diary