Lately, I’ve been coding with Python and building my own small programs and utilities. After building a simple timer and a calculator that both worked, I got to wondering: “Is my code in the simplest, most efficient state it can be?” How does it read to other people who use it? Does it perform well?
But what is refactoring? Refactoring is the process of making your code easier to read and maintain, usually without changing the runtime behavior of the program.
“Refactoring is the process of making your code easier to read and maintain without changing the runtime behaviour of the program”. …
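As a rough illustration of that definition, here is a minimal sketch of one function before and after a refactor. The function names and the sample prices are hypothetical and not taken from the article; the point is only that both versions return the same result while the second is easier to read.

```python
def total_price_before(prices, tax_rate):
    # Original style: index-based loop and a manually updated running total.
    t = 0
    for i in range(len(prices)):
        t = t + prices[i]
    t = t + t * tax_rate
    return t


def total_price_after(prices, tax_rate):
    """Return the sum of the prices with tax applied."""
    # Refactored style: a descriptive name and the built-in sum().
    subtotal = sum(prices)
    return subtotal + subtotal * tax_rate


prices = [10.0, 5.5, 4.5]  # sample values chosen for illustration
print(total_price_before(prices, 0.25))  # 25.0
print(total_price_after(prices, 0.25))   # 25.0
```

Checking that the old and new versions agree on a few inputs is a simple way to confirm that a refactor did not change behavior.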
Getting any kind of start in Data Science is likely to lead you into the world of NumPy. NumPy is designed to work extremely well with numbers and to support mathematical operations on numeric data. It has its own data structure, the array, which makes working with homogeneous data (data of the same type) faster and more efficient.
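A small sketch of what that looks like in practice, assuming NumPy is installed and imported under its conventional alias `np`. The variable names and sample values are made up for illustration.

```python
import numpy as np

# Every element of an array shares one data type.
temperatures_c = np.array([18.5, 21.0, 19.25, 23.0])
print(temperatures_c.dtype)  # float64

# Mixing an int and a float still yields a single type: the int is upcast.
mixed = np.array([1, 2.5])
print(mixed.dtype)  # float64

# Arithmetic applies to every element at once, with no explicit loop.
temperatures_f = temperatures_c * 9 / 5 + 32

# Arrays can have more than one dimension.
grid = np.arange(6).reshape(2, 3)  # rows [0, 1, 2] and [3, 4, 5]
print(grid.shape)        # (2, 3)
print(grid.sum(axis=0))  # column totals: [3 5 7]
```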
This article aims to introduce NumPy and its multi-dimensional array data structure. I recommend following along in a Jupyter Notebook which makes it easier to run the code and learn the library as you go along. By the end you will have learnt:
During the final weeks of my Flatiron Data Science Bootcamp, I worked on a classification machine learning project with a colleague. We experimented with a few different models, learning to optimise the hyperparameters and output their predictions. We settled, finally, on a Random Forest ensemble model. It wasn’t until a month after the project, when I was asked if I knew the feature importances of the final model, that I realised I had completely forgotten to extract them!
Extracting the feature importance values from your classifier is useful in several ways. It can provide business intelligence, creating a deeper understanding of the business and driving exploration into specific areas. It can help the explainability/interpretability of the model, and it can be used for feature selection that strengthens the final model performance. …
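For readers who want to see the idea in code, here is a minimal sketch that assumes scikit-learn's `RandomForestClassifier` and a synthetic dataset. The data, the feature names, and the hyperparameters are placeholders, not the ones from the bootcamp project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data; the real project data is not shown in the excerpt.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=3, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

# feature_importances_ holds one impurity-based score per input column.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

The scores sum to 1, so each one can be read as a share of the model's total impurity reduction.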
I've been uneasy about my know-how around environment variables. $PATH, $HOME, $USER come up from time to time. I decided enough is enough and I went on a journey to learn about environment variables and came out with much more.
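As a quick way to look at those variables from Python, here is a small sketch using the standard `os` module. The `"<not set>"` fallback is an illustrative choice, since not every system defines all three variables.

```python
import os

# os.environ behaves like a dictionary of the current process's environment.
home = os.environ.get("HOME", "<not set>")
user = os.environ.get("USER", "<not set>")

# PATH is a single string of directories joined by os.pathsep (":" on Unix).
path_entries = [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]

print(f"HOME = {home}")
print(f"USER = {user}")
print(f"PATH has {len(path_entries)} entries")
for directory in path_entries:
    print("  ", directory)
```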
By reading this article, you will learn how to write small programs in Python that can be executed anywhere from your terminal command line. Ultimately, we will create a small program that is executed with the command:
hello
and returns ‘Hello…
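One common way to get a command like this is sketched below, under a few assumptions: the file is saved as `hello` with no extension, it is made executable with `chmod +x hello`, and it sits in a directory listed in `$PATH`. The greeting text is a placeholder, because the excerpt above is cut off before showing the real output.

```python
#!/usr/bin/env python3
"""Hypothetical contents of an executable file named `hello`."""


def greeting():
    # Placeholder message; the article's actual output is truncated above.
    return "Hello from my own command!"


def main():
    print(greeting())


if __name__ == "__main__":
    main()
```

The first line (the shebang) tells the shell which interpreter to use, which is what lets the file run without typing `python` in front of it.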
Learning to use the permutation formula set for data science.
How many different permutations of the coloured squares on the face of the Rubik’s Cube above can there be?
In data science, it can be important to work out the possible permutations in a dataset for specific entities. By the end of this article you will learn how to calculate:
You will also be able to answer the question about the Rubik’s Cube above.
In mathematics, a permutation of a set is, loosely speaking, an arrangement of its members into a sequence or linear order, or if the set is already ordered, a rearrangement of its elements. …
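To make the definition concrete, here is a short sketch using only the Python standard library. It uses the standard formula for arranging r items chosen from n distinct items, n! / (n - r)!, and the values of `n` and `r` are arbitrary examples. It does not attempt the Rubik's Cube question.

```python
import math
from itertools import permutations

n, r = 5, 3  # arrange 3 items chosen from 5 distinct items

# Formula: n! / (n - r)!
by_formula = math.factorial(n) // math.factorial(n - r)

# math.perm (Python 3.8+) computes the same quantity directly.
by_builtin = math.perm(n, r)
print(by_formula, by_builtin)  # 60 60

# itertools.permutations lists the actual arrangements of a small set.
arrangements = list(permutations("ABC"))
print(len(arrangements))  # 6
print(arrangements[0])    # ('A', 'B', 'C')
```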
Learning the Google Cloud API and the Google My Business (GMB) framework was not straightforward. I had to go to many different sources and ask many questions along the way. My efforts to obtain review data for a location through the GMB API drove me to frustration, so I laid the path as straight as I could, for those who follow, in this article.
That article allows you access to the reviews given that you already know your ‘locationId’ and ‘accountId’. More on that in the article above. …
$ pip install filprofiler
%load_ext filprofiler  # use the python fil kernel
%%filprofile  # run this in the cell you wish to evaluate
“the resulting sudden system failures, blue screens, and downtime were increasingly unacceptable in this day and age.”
I took some time to learn about how Python handles its memory management using a reference counting and garbage collection system. Data Science can be bottlenecked by a system’s CPU and memory, among other things. A CPU struggling to keep up with the load typically appears as your system ‘slowing down’. However, memory issues like leaks and high memory usage spikes, etc. …
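The two mechanisms mentioned above can be observed with a small sketch, assuming the standard CPython interpreter. The `Node` class and variable names are hypothetical and exist only to create a reference cycle.

```python
import gc
import sys
import weakref


class Node:
    """Tiny illustrative class used only to build a reference cycle."""

    def __init__(self):
        self.partner = None


# Reference counting: each new name bound to an object raises its count.
data = [1, 2, 3]
count_before = sys.getrefcount(data)
alias = data
count_after = sys.getrefcount(data)
print(count_after > count_before)  # True

# Two objects that point at each other never reach a count of zero on their own.
a, b = Node(), Node()
a.partner, b.partner = b, a
probe = weakref.ref(a)  # lets us check whether the object still exists

del a, b      # the names are gone, but the cycle keeps both objects alive
gc.collect()  # the cyclic garbage collector finds and frees the cycle
cycle_freed = probe() is None
print(cycle_freed)  # True
```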
GET
https://mybusiness.googleapis.com/v4/accounts/{accountId}/locations/{locationId}/reviews
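A minimal sketch of how that request could be assembled with Python's standard library is shown below. It assumes you already hold an OAuth 2.0 access token and know your `accountId` and `locationId`; the function name and the placeholder values are hypothetical, and the request is only built here, not sent.

```python
import urllib.request

BASE_URL = "https://mybusiness.googleapis.com/v4"


def build_reviews_request(account_id, location_id, access_token):
    """Build (but do not send) the GET request for a location's reviews."""
    url = f"{BASE_URL}/accounts/{account_id}/locations/{location_id}/reviews"
    return urllib.request.Request(
        url,
        # Assumption: the token is passed as a standard OAuth 2.0 bearer header.
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


# Placeholder identifiers and token, for illustration only.
request = build_reviews_request("123", "456", "PLACEHOLDER_TOKEN")
print(request.get_method(), request.full_url)
```

With a real token, passing `request` to `urllib.request.urlopen` would perform the call.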
I began my journey into obtaining Google reviews last week and was surprised to find that it wasn’t as straightforward as I thought it would be. I am writing this article to help others who wish to obtain Google review data (and other data related to a location).
Getting Started — Permissions
Enabling and Using the Google My Business API
Make sure you have access both to the project account on the Google Cloud Platform (GCP) and to the business account on Google My Business (GMB). …
Introduction to the Transformer Library for Advanced NLP
from transformers import pipeline
nlp = pipeline("sentiment-analysis")
result = nlp("Lovely atmosphere, staff are super friendly and wonderful people.")[0]
print(f"label: {result['label']}, with score: {round(result['score'], 4)}")
I asked a fellow member of my Bootcamp cohort, “I want to do sentiment analysis on reviews for a friend, do you know of any good leads?” He wrote back, “Look at the Transformer library, you can do an accurate prediction with 3 lines of code.”
Hello everyone,
I’ve been meaning to dive deeper into SQL. So I thought of a fun build idea that could use Python and SQL to keep track of materials in my favorite old school game Phantasy Star Online. Welcome to Simple Python Programs — Simple Tracker
In the video above I run through the creation of the program end to end.
The program uses sqlite3 and Python 3. I use two other built-in libraries (sys and subprocess) to clear the screen between prints and to exit the program if needed.
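The core of a tracker like this can be sketched in a few lines of sqlite3. This is not the program from the video: the table name, column names, helper functions, and sample materials are all assumptions, and an in-memory database is used so the sketch leaves no file behind.

```python
import sqlite3

# ":memory:" keeps the example self-contained; a real tracker would use a file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE materials (name TEXT PRIMARY KEY, quantity INTEGER NOT NULL)"
)


def add_material(conn, name, amount):
    """Add `amount` to an existing material, or create the row if it is new."""
    cursor = conn.execute(
        "UPDATE materials SET quantity = quantity + ? WHERE name = ?",
        (amount, name),
    )
    if cursor.rowcount == 0:  # no existing row was updated
        conn.execute(
            "INSERT INTO materials (name, quantity) VALUES (?, ?)",
            (name, amount),
        )
    conn.commit()


def list_materials(conn):
    """Return all (name, quantity) rows in alphabetical order."""
    return conn.execute(
        "SELECT name, quantity FROM materials ORDER BY name"
    ).fetchall()


add_material(conn, "Power Material", 2)
add_material(conn, "Mind Material", 1)
add_material(conn, "Power Material", 3)

inventory = list_materials(conn)
print(inventory)  # [('Mind Material', 1), ('Power Material', 5)]
```

Using `?` placeholders rather than string formatting keeps user-typed names from being interpreted as SQL.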
I’m a really big fan of building simple utilities you can run from your…