Python Pandas or BeautifulSoup, anyone?
I’m not a technical person (I struggle with WordPress, and only recently opened my first Twitter account – add me). I came across Python when I was in college – back when it wasn’t “in”. It has grown rapidly over the years, and is becoming (if it hasn’t already) a prerequisite for any kind of data-related job.
Most jobs, and most of everything else, are now data-driven. That, however, is not my motivation.
I am not seeking job marketability.
I’m just intellectually curious, and I think it’d be a good hobby when I’m actually FI :).
I picked up VBA in Excel this year, and I’m amazed at the power that has given me. I’ve enjoyed it greatly – and it has given me the confidence that maybe I can program.
I am, after all, and just like you, a logical person — more logical than some of my programming buddies.
Within Python, Pandas and BeautifulSoup are two libraries I’m keen to explore: Pandas for data analysis, and BeautifulSoup for web scraping. A recent Planet Money podcast featured a guy who tracked textbook prices on Amazon. I thought that was just incredible. The most popular Python libraries, which I’m sure you’d like to see, are here.
I installed Python, but couldn’t get it running at first. There are lots of YouTube tutorials, but my issues seemed to be unique. You can follow these steps to get it running – it’s quick, it’s free, and the scope is beyond anything I am aware of at the moment.
- Install Python. I got the previous version rather than the latest one (Windows x86 MSI installer)
- Once installed, open the Command Prompt (click on Start / Windows, and type cmd)
- Find out where you’ve housed Python (enter “dir” to list what is in the current directory, “cd + directory name” to move into a directory, and “cd..” to move back up one level)
- Install Pandas (“pip install pandas”). I also installed numpy, xlwings (for Excel), and jupyter (a notebook interface).
- If the installations err, try upgrading pip (“pip install --upgrade setuptools” and/or “pip install --upgrade pip”) first and then try installing pandas again. Note that the flag is two plain hyphens. I was able to get pandas after upgrading pip.
- Open Jupyter by running “jupyter notebook” from Python’s Scripts folder: “C:\Python34\Scripts>jupyter notebook” (or “C:\appdata\Python34\Scripts>jupyter notebook”)
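
If you want to check that the steps above worked, here is a small sketch that reports which of the packages Python can actually find. It uses only the standard library, so it should run even if pandas failed to install. The function name check_packages and the list of import names are my own assumptions for illustration (in particular, I am assuming each package is imported under the same name you gave to pip).

```python
import importlib.util
import sys

# Import names for the packages mentioned in the steps above.
# Assumption: each one is imported under the same name used with pip.
PACKAGES = ["pandas", "numpy", "xlwings", "jupyter"]


def check_packages(names):
    """Return a dict mapping each import name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Shows which Python installation is running this script.
    print("Python executable:", sys.executable)
    for name, found in check_packages(PACKAGES).items():
        if found:
            print("{}: installed".format(name))
        else:
            print("{}: missing - try: pip install {}".format(name, name))
```

Save it under any name you like (say, a hypothetical check_install.py) and run it from the Command Prompt with “python check_install.py”. Anything reported as missing is a candidate for another “pip install”.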