How It Works

Data Pipeline

[1] Data Collection
Using the Python Selenium and Beautiful Soup libraries to build web scraping tools, our team collected over 600k unique resumes and 1.1 million salary records from the web, representing over 5,500 cities across 30 years.

[2] Data Transformation
Data ingestion pipelines were created to standardize the data inputs from the various sources.

Job Titles
Job titles vary widely from resume to resume, so we created a heuristic to process each job title into a standard format.
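The job title heuristic itself is not shown here, so the following is only a minimal sketch of what such a normalization step might look like. The abbreviation table, the qualifier list, and the function name `normalize_title` are all hypothetical, and separating a "job qualifier" from the base title at this stage is an assumption.

```python
import re

# Hypothetical lookup tables; the project's actual rules are not shown in the source.
ABBREVIATIONS = {"sr": "senior", "jr": "junior", "mgr": "manager", "eng": "engineer"}
QUALIFIERS = {"senior", "junior", "lead", "principal"}


def normalize_title(raw):
    """Return (standard_title, qualifier) for a free-text job title."""
    # Lowercase, then replace anything that is not a letter or space with a space.
    cleaned = re.sub(r"[^a-z ]+", " ", raw.lower())
    # Expand common abbreviations token by token.
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in cleaned.split()]
    # Separate seniority qualifiers from the base title.
    qualifiers = [tok for tok in tokens if tok in QUALIFIERS]
    base = [tok for tok in tokens if tok not in QUALIFIERS]
    # "no_value" marks a title without any qualifier (assumed convention).
    qualifier = qualifiers[0] if qualifiers else "no_value"
    return " ".join(base), qualifier
```

With these tables, `normalize_title("Sr. Software Eng.")` and `normalize_title("Senior Software Engineer")` map to the same standard title, which is the point of normalizing before any grouping or modeling.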

The Models

[1] Document Similarity Model
We provide career recommendations ranked by their similarity to the job seeker’s current job experience. Our model is powered by resume information from across the US.

This product is a proof of concept and covers a small subset of possible jobs (114 unique job titles). The list of jobs is determined by the availability of salary and resume data: we selected all jobs with at least 100 salary records in the past 5 years and 500 resume job summaries in the past 10 years.
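The text does not say which similarity technique the model uses. As an illustration only, the sketch below ranks jobs with TF-IDF vectors and cosine similarity, a common baseline for document similarity. The function names and the toy job summaries are hypothetical and stand in for the real resume data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        # Smoothed inverse document frequency keeps every weight positive.
        vectors.append({
            term: count * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_jobs(experience, job_summaries):
    """Return (job_title, score) pairs, most similar first."""
    titles = list(job_summaries)
    docs = [tokenize(experience)] + [tokenize(job_summaries[t]) for t in titles]
    vectors = tfidf_vectors(docs)
    query, job_vectors = vectors[0], vectors[1:]
    scores = [(title, cosine(query, vec)) for title, vec in zip(titles, job_vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy data for illustration; the real model draws on resume job summaries.
TOY_SUMMARIES = {
    "data analyst": "analyze data build dashboards write sql reports",
    "web developer": "build websites with javascript html css",
}
ranked = rank_jobs("I write SQL queries and build dashboards", TOY_SUMMARIES)
```

In this toy case the job seeker's text shares more weighted terms with the data analyst summary, so that title is ranked first.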

Visualization Design

Salary Bar Chart Visualization
Built using D3.js. This visualization lets the user see salaries from the past 5 years for their state (or nationally). Salaries are grouped by any “job qualifiers” on the job titles; “no_value” indicates that a job has no job qualifiers. The bars show the lower quartile, median, and upper quartile. The tooltip gives additional metrics.

Next Steps
- Create a visualization that shows jobs with decreasing/increasing demand and salaries.
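The quartile summary behind the salary bars could be computed along the following lines. This is a sketch under assumptions: the source does not say where the aggregation happens or which quartile method is used, and the function name `salary_quartiles` and the record layout are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles


def salary_quartiles(records):
    """Group (qualifier, salary) pairs and return quartiles per qualifier."""
    groups = defaultdict(list)
    for qualifier, salary in records:
        # Titles without a job qualifier are grouped under "no_value".
        groups[qualifier or "no_value"].append(salary)

    summary = {}
    for qualifier, salaries in groups.items():
        if len(salaries) < 2:
            continue  # statistics.quantiles needs at least two data points
        # The "inclusive" method is an assumption; the source does not specify one.
        q1, median, q3 = quantiles(salaries, n=4, method="inclusive")
        summary[qualifier] = {
            "lower_quartile": q1,
            "median": median,
            "upper_quartile": q3,
        }
    return summary
```

Each entry in the returned dictionary corresponds to one bar group in the chart, with the three values the bars display.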

Back-End Architecture

[1] The Backend Application
The backend is a Flask application whose primary purpose is to accept job description inputs and return job similarity scores from the model, so that the frontend application can consume the model results. The application is built as a Docker container, which can easily be tested locally, then uploaded to Google Cloud Container Registry and deployed to a Google Cloud cluster with Kubernetes.
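A minimal sketch of such a Flask endpoint is shown below. The route path `/similarity`, the JSON field names, and the `score_jobs` placeholder are assumptions for illustration; the placeholder uses simple keyword overlap and is not the project's model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical stand-in data; the real application would call the similarity model.
KEYWORDS = {
    "data analyst": {"sql", "data", "dashboards"},
    "web developer": {"javascript", "html", "css"},
}


def score_jobs(description):
    """Placeholder scorer: share of each job's keywords found in the description."""
    words = set(description.lower().split())
    return {title: len(words & kws) / len(kws) for title, kws in KEYWORDS.items()}


@app.route("/similarity", methods=["POST"])
def similarity():
    # Accept a JSON body such as {"description": "..."}.
    payload = request.get_json(silent=True) or {}
    description = payload.get("description", "")
    if not description:
        return jsonify(error="description is required"), 400
    # Return jobs ordered from most to least similar.
    ranked = sorted(score_jobs(description).items(), key=lambda kv: kv[1], reverse=True)
    return jsonify(results=[{"job_title": t, "score": s} for t, s in ranked])
```

Keeping the scoring logic in a plain function, separate from the route, makes it easy to swap the placeholder for the real model and to test the endpoint locally before building the container.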

Front-End Architecture

[1] UI Layer
The UI layer has three components: a grid layout is created using Bootstrap; user interaction logic and state are maintained using React JS; and D3.js is used for visualization. The JavaScript libraries are sourced directly from a CDN, while Ajax-style communication between the UI and the frontend web server is handled by the Axios JS library. To simplify the design, user input is passed to the frontend server via an HTTP extension header instead of the traditional HTTP form POST method.
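To illustrate the header-based approach, here is a small sketch using only the Python standard library. The frontend server's actual stack is not described, so Python is used as a stand-in. The header name `X-User-Input`, the percent-encoding of the value, and the helper `send_input` are all assumptions.

```python
import http.client
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import quote, unquote

INPUT_HEADER = "X-User-Input"  # hypothetical extension header name


class InputHandler(BaseHTTPRequestHandler):
    """Reads user input from a custom request header instead of a form body."""

    def do_GET(self):
        # Header lookup is case-insensitive; the client percent-encodes the
        # text so that it is safe to carry in a header value.
        user_input = unquote(self.headers.get(INPUT_HEADER, ""))
        body = json.dumps({"received": user_input}).encode("utf-8")
        self.send_response(200 if user_input else 400)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def send_input(host, port, text):
    """Client side: send the text in the extension header of a GET request."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/", headers={INPUT_HEADER: quote(text)})
        response = conn.getresponse()
        return response.status, json.loads(response.read().decode("utf-8"))
    finally:
        conn.close()
```

One trade-off of this design is that header values are size-limited and restricted in character set, which is why the sketch encodes the input before sending it.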
