Data Analyst Job Skills Search

Excel - Power Query

Tableau - Interactive Dashboard

Backstory

Navigating the job market isn’t easy. I haven’t researched the numbers to back it up, but it sure feels like finding a data analyst position you’re a good fit for is akin to that old adage about a fish swimming upstream.

Data Science is a field growing at incredible rates, which is a good thing, right? Of course! The only challenge I can see is that the competition to land your dream job has matched its exponential growth. And with so many platforms and software available specializing in cleaning data, disseminating, and visualizing it, would-be candidates can understandably feel overwhelmed.

You can easily succumb to the pressure of trying to learn as many tools as possible, let alone being proficient at them.

So, out of genuine curiosity, I downloaded a Kaggle dataset of real-time job postings using the Google search term “data analyst”. My hope, using Tableau as a visual aid, is to identify which skills are most sought after, and confirm that I’m on the right path!

Process

After an initial review of the dataset, I decided to broaden the scope of my analysis. There is enough information to tackle a few other queries related to my own job search.

For instance,

  1. What skills are most sought after, specific to companies with an abundance of positions open?
  2. Is there a difference in the expected salary of the positions posted from one job board to another?

There are many fields to explore, but some of the more relevant metrics within the data include:

  • Company hiring for the position
  • Via (where the position is posted)
  • Salary (min, max, average, hourly, etc.)
  • When it was posted (all within the last 24 hours)
  • Work from home optionality
  • Desired skills for the position

From the get-go, it was clear that some fine tuning would be needed. The initial problem I found was the formatting. Primarily, the desired skills listed for each job opening was grouped together in a single cell. Not only that, when I first loaded the data, the job description section wreaked havoc during the interpretation. Values became misaligned and tested my patience beyond belief!

I could have fixed the formatting within Tableau, which has powerful capabilities. It’s not just a visual tool! But instead, I opened the file in Excel. Power Query can do all that I need in just a few clicks and with unquestionable confidence.

First, I split my skills field into multiple rows. Identifying the brackets, commas, and white spaces as delimiters, Power Query expanded the section and more importantly, duplicated the other fields, allowing me to reference the appropriate index numbers during my aggregations.

I took it a step further and applied a TRIM function to ensure I didn’t encounter any unnecessary outliers during the analysis. I considered reformatting the Salary range, however there was no need. An additional field in the data had standardized the salary. So, no point in the extra steps! Small wins, right?

One last thing I needed Power Query to do for me… I planned on utilizing a radial chart to display my skill data. To do that in Tableau, I need a field with two values for each index number. The two values, 0 and 270, will help format the X and Y axis of the radial.

Since I’m already using Excel, I add a column called ‘Path’ and insert the values. Then Power Query splits the column into rows, doubling the size of my file.

I’m happy with the data and formatting, so it’s time to jump into Tableau!

Visualization

After importing the cleansed dataset into Tableau, I checked my data types and ensured that all my formatting loaded as expected. Moving on, I open a new sheet and create my first visual, a text table.

Wanting to show the average salaries, from the minimum to maximum, and overall average, I create three calculated fields for the Salary Min, Salary Max, and Salary Standardized measures.

{FIXED [Index]:MIN([Salary Max])}

This calculation is the format I used to derive the minimum salaries, using only distinct index numbers. Since my index numbers were duplicated several times over when splitting my columns up in Power Query, this step is necessary to show accurate results. After inserting the new measures into the sheet, I alter the aggregation to an average instead of sum.

Next, I open a second sheet and start developing a bar chart to show the average salary expectations, grouped by the source of the job post. However, I only want to show the top ten job boards based on average salary.

So, I insert my dimension, ‘Via’, apply the aggregation of average salary created previously, and sort the measure in descending order. I apply a filter to only show the top ten and add a label and color mark based on the number of distinct indexes, or job postings.

My third sheet is used to display a packed bubble chart, highlighting the job source, schedule type (Full-time, Contractor, etc.), and relevant companies. After setting up a custom tool tip to detail chosen dimensions and count of unique job IDs, I apply some filters and color to finish it up.

Onto my last sheet, which was the inspiration for this visual analysis. The primary dimension used is the ‘Description Tokens’, which are the desired skills listed for each job opportunity. Creating the radial chart involved many steps of creating calculated fields to lay on top of each other.

I won’t break down every step, but here’s the high-level view of what was needed:

  • Pull in the ‘Path’ column of 0’s and 270’s, converting to bins of 1 and utilizing it to create a line chart.
  • Create eight calculated fields using functions like WINDOW_MAX, WINDOW_SUM, COUNTD, RANK_UNIQUE, SIN, COS and RADIANS.
  • Added a color palette, labels with percentages, adjusted the size of the radial lines, and set a filter to only show the top skills – only skills listed on at least 1500 job opportunities.

Finally, I pulled it all together for my dashboard. With some custom sizing, a unique color scheme, added text, images, and borders, my ideal visualization begins to take shape. I decided to create an Action to align the packed bubble and radial charts when filtered. This way, as a potential job seeker filters down on the company or other metric, they can view the skills being sought after.

Conclusions

So, what did I discover? I discovered that this was a lot of fun!

Every step of the process, from data manipulation to experimenting with dashboard design had me excited about the final product.

Everyone will take something unique from the data. The further you drill-down on the filters, the more relevant the skills and other KPIs will be or won’t be to you. But for me, I feel confident that my most practiced and developed skills are sitting right at the top of those most asked for!

Behind the Scenes

This project was so much fun to put together! If you’d like a peak at my Tableau dashboard or the source data used, just click the button below and find all the references you need.

My GITHUB

Regional Supermarket Sales

Using a combination of SQL and Tableau, I take a dive into supermarket sales and inventory data over a four year period. Let's see who has risen to the top!

Learn more

COVID-19 Reporting Analysis

Thanks to the multitude of initiatives focused on understanding the impacts of the COVID-19 virus, there is no shortage of data available. So, why not start from the beginning and see how COVID has affected the hospital system and patients on a granular level?

Learn more