Reading data from files

We will show how to read data from TXT, CSV, XLSX, and JSON files.

Reading TXT

To read txt files we will use the NumPy library.

In [77]:
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('ggplot')

data = np.loadtxt('./data/data.txt')

data
Out[77]:
array([[21.36, 23.45],
       [10.54, 45.23],
       [23.54, 24.45],
       [58.2 , 21.23],
       [10.23, 36.87],
       [45.21, 18.12]])
In [78]:
plt.plot(data)
Out[78]:
[<matplotlib.lines.Line2D at 0x2708fbec5f8>,
 <matplotlib.lines.Line2D at 0x2708fbec780>]
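
When plt.plot receives a 2-D array it draws one line per column, which is why Out[78] lists two Line2D objects. If the two columns need to be handled separately, one option is to have np.loadtxt return them as separate arrays. The sketch below is only an illustration: it uses an in-memory text buffer (the hypothetical `sample`, holding the first three rows shown above) instead of ./data/data.txt, and the names `col_a` and `col_b` are made up for the example.

```python
import io

import numpy as np

# Hypothetical in-memory stand-in for ./data/data.txt:
# whitespace-separated values, first three rows of the output above.
sample = io.StringIO("21.36 23.45\n10.54 45.23\n23.54 24.45\n")

# unpack=True transposes the result, so each column comes back as its own 1-D array.
col_a, col_b = np.loadtxt(sample, unpack=True)

print(col_a)
print(col_b)
```

Each array can then be passed to plt.plot on its own, for example to plot one column against the other.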

Reading CSV

Here we will use the pandas library.

In [70]:
import pandas as pd

df = pd.read_csv('./data/data.csv', sep=',', header=0)

df
Out[70]:
         Date  Value
0  01/10/2018   1413
1  02/10/2018   1350
2  03/10/2018    326
3  04/10/2018    266
4  05/10/2018    210
In [71]:
plt.plot(df['Date'], df['Value'])
Out[71]:
[<matplotlib.lines.Line2D at 0x2708fa52ba8>]
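
In the cell above the Date column is read as plain strings, so matplotlib treats each date as a category label rather than a point in time. A possible refinement is to convert the column to real dates first. The sketch below assumes the dates are written day/month/year (the file itself does not say, and 01/10/2018 is ambiguous), and it reads from an in-memory copy of the first three rows rather than ./data/data.csv; `csv_text` and `df_dates` are names invented for the example.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for ./data/data.csv (first three rows of the output above).
csv_text = "Date,Value\n01/10/2018,1413\n02/10/2018,1350\n03/10/2018,326\n"

df_dates = pd.read_csv(io.StringIO(csv_text), sep=',', header=0)

# Assumption: dates are day/month/year, so 01/10/2018 means 1 October 2018.
df_dates['Date'] = pd.to_datetime(df_dates['Date'], format='%d/%m/%Y')

print(df_dates.dtypes)
```

With a datetime column, plt.plot(df_dates['Date'], df_dates['Value']) spaces the points according to the actual dates.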

Reading XLSX

The pandas library has methods for reading this type of file.

In [72]:
excel = pd.ExcelFile('./data/data.xlsx')

df = excel.parse('Sheet2')
df
Out[72]:
        Pais  Populacao (milhoes)
0      China                 1413
1      India                 1350
2        EUA                  326
3  Indonesia                  266
4     Brasil                  210
In [76]:
labels = df['Pais']
sizes = df['Populacao (milhoes)']
explode = (0.1, 0, 0, 0, 0)  # pull the first slice (China) out of the pie

fig1, ax1 = plt.subplots()
ax1.pie(sizes, explode=explode, labels=labels, autopct='%1.1f%%',
        shadow=True, startangle=90)
ax1.axis('equal')  # equal aspect ratio so the pie is drawn as a circle
Out[76]:
(-1.2187432589078402,
 1.1254979097930768,
 -1.1117512997851087,
 1.1340720638305084)
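
The percentages that autopct prints on the pie are each slice's share of the column total. As a rough check, the same shares can be computed directly with pandas. The sketch below rebuilds the table in memory from Out[72] instead of reading ./data/data.xlsx; `df_pop` and `shares` are names invented for the example.

```python
import pandas as pd

# Table rebuilt in memory from Out[72]; the real data lives in ./data/data.xlsx.
df_pop = pd.DataFrame({
    'Pais': ['China', 'India', 'EUA', 'Indonesia', 'Brasil'],
    'Populacao (milhoes)': [1413, 1350, 326, 266, 210],
})

sizes = df_pop['Populacao (milhoes)']

# Share of each country in the total of the five, in percent, rounded to one decimal.
shares = (100 * sizes / sizes.sum()).round(1)

for country, share in zip(df_pop['Pais'], shares):
    print(f"{country}: {share:.1f}%")
```

Note that the shares are relative to the five countries listed, not to the world population.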

Reading JSON

For this case we will use json, a module from Python's standard library.

In [100]:
import json
 
with open('./data/data.json') as f:
    data = json.load(f)
    
print(data)
[{'id': 1, 'first_name': 'Jeanette', 'last_name': 'Penddreth', 'email': 'jpenddreth0@census.gov', 'age': 19, 'ip_address': '26.58.193.2'}, {'id': 2, 'first_name': 'Giavani', 'last_name': 'Frediani', 'email': 'gfrediani1@senate.gov', 'age': 23, 'ip_address': '229.179.4.212'}, {'id': 3, 'first_name': 'Noell', 'last_name': 'Bea', 'email': 'nbea2@imageshack.us', 'age': 25, 'ip_address': '180.66.162.255'}, {'id': 4, 'first_name': 'Willard', 'last_name': 'Valek', 'email': 'wvalek3@vk.com', 'age': 34, 'ip_address': '67.76.188.26'}]
In [98]:
fig, ax = plt.subplots()
# Collect the first name and age of each record
names = [item['first_name'] for item in data]
ages = [item['age'] for item in data]

plt.bar(names, ages)

plt.xticks(names)
Out[98]:
([<matplotlib.axis.XTick at 0x2708fde74e0>,
  <matplotlib.axis.XTick at 0x2708fde3c88>,
  <matplotlib.axis.XTick at 0x2708fde39b0>,
  <matplotlib.axis.XTick at 0x2708fe0d588>],
 <a list of 4 Text xticklabel objects>)
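
json.load returns ordinary Python objects: here, a list of dictionaries, one per record. The sketch below illustrates that structure with json.loads on an in-memory string (the hypothetical `raw`, holding two fields from the first two records above) rather than ./data/data.json.

```python
import json

# Hypothetical in-memory stand-in for ./data/data.json, trimmed to two fields of two records.
raw = '[{"first_name": "Jeanette", "age": 19}, {"first_name": "Giavani", "age": 23}]'

# json.loads parses a string; json.load does the same for an open file.
records = json.loads(raw)

# Each record is a dict, so fields are accessed by key.
names = [record['first_name'] for record in records]
ages = [record['age'] for record in records]

print(names)
print(ages)
```

The resulting lists can be passed to plt.bar in the same way as in the cell above.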