About Data Janitor

Do you constantly have to clean and transform data in Excel? Are you repeating the same steps over and over? Don't you wish you could save those operations and reuse them easily in other Excel files?

Data Janitor helps you automate and save data cleaning recipes right in your browser. Data Janitor has tons of helpers to handle dates, strings and numbers.


How it Works

In Excel or Google Sheets, copy an entire worksheet (Ctrl+C). In Data Janitor, paste (Ctrl+V) that data on the Data tab. The data is converted to an array of hash objects, each representing a row. If you've toggled on the auto-detect headers option, the keys of each row object are the column header names. Otherwise the keys are the column indices, starting at 0.
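As a rough illustration of the two row shapes described above, the sketch below converts pasted text into row objects. It assumes the clipboard text is tab-separated, as spreadsheets usually provide it. The helper name rowsFromPaste is hypothetical and is not Data Janitor's actual parser. It ignores quoted cells that contain tabs or line breaks.

```javascript
// Hypothetical helper (not part of Data Janitor): turns tab-separated
// clipboard text into an array of row objects.
function rowsFromPaste(text, useHeaders) {
  const lines = text.split(/\r?\n/).filter((line) => line.length > 0);
  const cells = lines.map((line) => line.split("\t"));
  if (!useHeaders) {
    // Keys are the column indices, starting at 0.
    return cells.map((row) => Object.assign({}, row));
  }
  // The first line supplies the keys; the remaining lines are data rows.
  const headers = cells[0];
  return cells.slice(1).map((row) => {
    const obj = {};
    headers.forEach((name, i) => {
      obj[name] = row[i];
    });
    return obj;
  });
}

const pasted = "name\temail\nAda\tada@example.com\nBob\tbob@example.com";

// Header detection on: keys are the column header names.
const withHeaders = rowsFromPaste(pasted, true);

// Header detection off: keys are "0", "1", ... and the header line is a row.
const withIndices = rowsFromPaste(pasted, false);
```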

The JavaScript process function maps data from Input to Output. Once written, you can reuse it on other data sets that need the same cleaning and transformation logic.

The process function is passed two arguments, input and columns. The columns array holds the column headers (or indices) for convenience and lookup. The function must return an array of rows where each row is a hash. The returned rows are displayed in the Output table, and you can copy the output or download it as a CSV.
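A minimal sketch of such a function, assuming all you want is to trim whitespace from every cell and drop rows that are entirely blank. The sample data and the cleanRows variable are made up for illustration. In Data Janitor you would write only the process function itself; it is bound to another name here so the sketch also runs under Node.js, where process is already a global.

```javascript
// `input` is the array of row objects; `columns` lists their keys.
const cleanRows = function process(input, columns) {
  // Treat missing values as empty strings and trim everything else.
  const text = (value) => (value == null ? "" : String(value).trim());

  return input
    // Keep only rows that have at least one non-empty cell.
    .filter((row) => columns.some((col) => text(row[col]) !== ""))
    // Return a new hash per row with the trimmed values.
    .map((row) => {
      const out = {};
      columns.forEach((col) => {
        out[col] = text(row[col]);
      });
      return out;
    });
};

// Made-up sample input with header names as keys.
const sample = [
  { name: "  Ada ", city: "London" },
  { name: "", city: "   " },
  { name: "Bob", city: " Paris" },
];
const cleaned = cleanRows(sample, ["name", "city"]);
```

Because the function reads the keys from columns rather than hard-coding them, the same code can be reused on another sheet with different headers.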


Helpers

The Underscore.js, underscore.string and Moment.js libraries are available when you write the process function. In addition, you can validate an email address with _.isEmail(email), which returns true or false. Check out the Tips section for common cleaning patterns.
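The sketch below shows one way _.isEmail might be used to flag invalid addresses. The email and email_valid column names are hypothetical. Inside Data Janitor the _ object is already provided. The fallback here is a simplified stand-in, an assumption that exists only so the sketch runs on its own, and it may not match the real helper's rules.

```javascript
// Use Data Janitor's `_` when present; otherwise fall back to a simplified
// stand-in check (an assumption, not the real _.isEmail implementation).
const helpers =
  typeof _ !== "undefined" && typeof _.isEmail === "function"
    ? _
    : { isEmail: (value) => /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(String(value)) };

// Copies each row and adds a hypothetical `email_valid` column.
const flagEmails = function process(input, columns) {
  return input.map((row) =>
    Object.assign({}, row, { email_valid: helpers.isEmail(row.email) })
  );
};

// Made-up sample rows.
const flagged = flagEmails(
  [{ email: "ada@example.com" }, { email: "not-an-email" }],
  ["email"]
);
```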


Data Confidentiality

Data you paste and code you write in the Sandbox session are kept on your computer, in the browser's local storage. They are not uploaded to the server.

Use the Save link to save your session to the server. You can share the link to that session with co-workers or bookmark it for later.

You can delete a saved session from the server at any time. This will not delete the data from the computers of people you have shared it with. You will need to ask them to delete the saved session as well.


Security

The JavaScript function runs in a sandboxed environment using a web worker. This prevents malicious code from running on your computer. It also lets you stop processing if there is an infinite loop in the process function.
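The general pattern for stopping a runaway worker can be sketched as follows. This is not Data Janitor's actual implementation. The runWithTimeout helper, the timeout value and the worker file name are all hypothetical. The helper accepts any object with the Web Worker interface (postMessage, terminate, onmessage), so it can be tried outside a browser with a stand-in object.

```javascript
// Hypothetical helper: sends `input` to a worker and reports the result,
// terminating the worker if it does not answer within `timeoutMs`.
function runWithTimeout(worker, input, timeoutMs, onDone) {
  let settled = false;

  const finish = (error, output) => {
    if (settled) return; // report only the first outcome
    settled = true;
    clearTimeout(timer);
    worker.terminate(); // stops the script even if it is stuck in a loop
    onDone(error, output);
  };

  const timer = setTimeout(
    () => finish(new Error("Stopped after " + timeoutMs + " ms"), null),
    timeoutMs
  );

  worker.onmessage = (event) => finish(null, event.data);
  worker.onerror = (event) => finish(new Error(event.message), null);
  worker.postMessage(input);
}

// In a browser this might be called as (file name is hypothetical):
// runWithTimeout(new Worker("process-worker.js"), rows, 5000, (err, out) => {
//   if (err) console.error(err.message); else console.log(out);
// });
```

The worker has its own thread, so calling terminate() from the page ends the loop without freezing the page.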


BETA Software

Data Janitor is new and still in BETA. If you find bugs or have suggestions, please open a GitHub issue.


Change Log

Jan 26 2019

Removed 64k limit on download button.

Dec 31 2018

Added ability to name saved sessions.

Dec 25 2018

Exposed sessions in the UI.

Dec 19 2018

Added ability to save session and request service.

Nov 11 2018

Initial BETA release.