Release history

May 04, 2021 - 1.24.0

New Features

  • Resize the width of a column by double-clicking on the right edge of its column header

  • You can also resize all columns at the same time by clicking in the dataset and pressing Ctrl+Shift+A

  • Python 3.9 support: We now support Python 3.9 (well, that was about time! ;))

March 26, 2021 - 1.23.0

New Features

  • Export transformation descriptions: for each piece of pandas code we write for you, we also add a short comment line describing in plain English what the code does. You can enable this feature via the config by running bam.set_option("global.export_transformation_descriptions", True) in a Jupyter cell.

  • Users can now request features from within bamboolib!

Feb 19, 2021 - 1.22.2

Improvements

  • Improve plugins API

Feb 12, 2021 - 1.22.1

Improvements

  • Add human-readable error message when no aggregation is specified in Groupby Transformation

Bug Fixes

  • Fix warning messages created by sklearn

  • Fix bug about guessing the dataframe name

Feb 08, 2021 - 1.22.0 - Free community version 🎉

We are really excited to announce that bamboolib is ready for the general public after months of intense user feedback and production iterations with our private beta customers.

Our mission is to empower every professional to perform Data Analysis in Python - without the burden of becoming a programmer or the hassle of googling syntax all day long. The keystone of this mission is the bamboolib GUI, which exports Python code, similar to recording macros in Excel.

But that's not all. In many companies, there are routine analyses that are highly specific to the company and that have to be performed over and over again. Previously, such analyses have often been stored as internal Python libraries - created by and accessible only to the programmers. With bamboolib, those features can be quickly added to the GUI via simple Python plugins. Our existing customers use this to access internal databases, perform data cleaning or even to create complex machine learning models. All of this is accessible to everyone in the company, which removes the programmer bottleneck.

This frees developers from performing routine ETL tasks or creating standard plots for their colleagues and enables the colleagues to access the information they need whenever they need it in a self-service way.

But we are not only serving companies. There are millions of professionals, learners, students, and hobbyists who want to access the power of Python. Unfortunately, many of them have been locked out by the steep learning curve of becoming a programmer. For all those people, we are offering our free community version, which makes it easy to analyze their data and create reproducible reports and provides them with the tools they need to bridge the gap to the fantastic world of open-source software.

We love the open-source ecosystem and everything we do is targeted towards making the open-source ecosystem stronger and more accessible. Both by trying to open-source as many of our internal projects as we can - e.g. ppscore and pyforest - and mainly by trying to build software that enhances the impact of open-source. There is still a long way to go but we will not stop until everyone can access the full power of open-source software.

Thank you for all your support. For further information, please visit our website.

Feb 02, 2021 - 1.21.0

Features

  • New plot 📊 - Treemap

  • Create a copy of your data 💾

    Sometimes, you want to keep a copy of the original data before you start transforming it. With the

  • Create a copy of your column 💾

Other improvements

  • "Explore data" also supports columns that contain list objects and dictionary objects.

Jan 18, 2021 - 1.20.0

Features

  • Bulk Change Data Types ◼

    Change the data type of multiple columns in one go.

Change data type of all columns to String

Other Improvements

  • All dropdowns can handle very long column names

  • Plot Creator:

    • Sort x-axis by category name in histogram and bar chart.

    • Make x-axis categorical (handy when you want to make the bars in a histogram have some space in between).

    • Rename the x-axis. You can do that by searching "Rename column label" when adding a property to the Plot Creator.

  • Renamed the "Find and Replace" transformations so that they are easier to find and distinguish

Bug Fixes

  • Fixed a bug that caused bi-variate plots to crash when missing values needed to be dropped

  • Make ppscore handle Int8

  • Sometimes, the scrollbar hid the last row in the interactive data table view. We made sure that users can always see the full table.

Jan 04, 2021 - 1.19.1 - Happy new year everyone 🎆

Features

  • Read Excel Files 📑 After running "import bamboolib as bam" and then evaluating "bam" in a cell, you can now also read in Excel spreadsheets

  • Major PlotCreator Upgrade 🎉 Over the last months, we've developed a new plotting interface that makes the PlotCreator much more powerful. You can now change almost every bit of your plotly plots, including the color theme, the legend title, axis names, and tick distance.

  • New plot 📊 - Candlestick chart

  • New transformations 🧹

    • Drop a column if it contains at least one missing value or if it contains only missing values

    • Calculate the cumulative product of values in a column

    • Calculate the percentage change of values in a column
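
For readers who want to see what the new transformations listed above roughly correspond to in plain pandas, here is a minimal sketch. The dataframe and column names are made up for illustration, and the code bamboolib actually exports may look different.

```python
import pandas as pd

# Hypothetical example data: a short price series.
df = pd.DataFrame({"price": [100.0, 110.0, 99.0]})

# Cumulative product of the values in a column.
df["price_cumprod"] = df["price"].cumprod()

# Percentage change relative to the previous row.
# The first row has no predecessor, so its value is NaN.
df["price_pct_change"] = df["price"].pct_change()

# Drop every column that contains at least one missing value ...
no_missing = df.dropna(axis="columns", how="any")

# ... or only those columns that consist entirely of missing values.
not_all_missing = df.dropna(axis="columns", how="all")

print(list(no_missing.columns))
print(list(not_all_missing.columns))
```

In this example, how="any" removes the percentage-change column because of its leading NaN, while how="all" keeps all three columns.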

Other Improvements

  • We improved the user experience for all text transformations. You can now search for them directly in the search bar (just type "text" to see all text transformations). Also, most text transformations now allow selecting multiple columns and renaming the result for a significant speedup.

Dec 12, 2020 - 1.19.0

Internal developments for our corporate clients and improvements to our architecture.

Oct 09, 2020 - 1.18.1

Bug fixes

  • Fixed a bug that caused an error when loading in CSV files on Windows

  • After installing bamboolib, plotly wasn't enabled inside the GUI.

Sep 28, 2020 - 1.18.0

Features

  • Clean column names of your dataframe 🧹

    Do you have messy column names (blanks and punctuation everywhere) and want to keep them in one clean format? "Clean column names" transforms your column names into one clean format.

  • Load Plugin support 💾

    Write loader plugins yourself and add them to bamboolib so that you can read in any data source you need. There are no limits. Read in an Excel file or distributed data from a Hadoop cluster. If you can express it in Python, you can call it via GUI from bamboolib.

  • View Plugin support 📊

    Users can now write plugins that display something to the end user without altering the underlying dataframe. This can be used for creating company-specific clustering plots or for conducting and displaying table results of statistical tests.
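
To give an idea of what cleaning column names can look like in code, here is a small standard-library sketch. The target format (lowercase snake_case) and the helper name clean_column_name are assumptions for illustration; the release notes do not state which format bamboolib produces.

```python
import re


def clean_column_name(name: str) -> str:
    """Lowercase a name and replace runs of blanks/punctuation with '_'."""
    cleaned = re.sub(r"[^0-9a-zA-Z]+", "_", name.strip().lower())
    # Remove underscores left over at the start or end of the name.
    return cleaned.strip("_")


# Hypothetical messy column names.
columns = ["  Customer ID ", "Total Sales ($)", "signup.date"]
cleaned_columns = [clean_column_name(c) for c in columns]

print(cleaned_columns)  # ['customer_id', 'total_sales', 'signup_date']
```

With pandas, the same function could be applied via df.rename(columns=clean_column_name).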

Bug fixes & Other improvements

  • We improved the layout of our explore dataframe features to make it look cleaner.

  • Fixed a bug that caused the univariate summary to throw an unknown semantic datatype Error when faced with a boolean column that only has one unique value.

Sep 4, 2020 - 1.17.1

Features

  • Data Loader - Read in CSVs 📂

    Read in CSV files using bamboolib without having to write any code

Other Improvements and Bug Fixes

  • Added bamboolib.Text and Button to plugin docs.

  • Check if the resulting dataframe name of a transformation is a valid Python variable name and raise a human-readable error if it isn't.

  • Fix bivariate plot bugs that led to errors when e.g. columns contained missing values
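
The variable-name check mentioned above can be sketched with the Python standard library. This is only an illustration of the idea, not bamboolib's implementation; the function names and the error text are hypothetical.

```python
import keyword


def is_valid_variable_name(name: str) -> bool:
    """Return True if `name` can be used as a Python variable name."""
    # isidentifier() accepts reserved words such as "class",
    # so those have to be excluded separately.
    return name.isidentifier() and not keyword.iskeyword(name)


def check_dataframe_name(name: str) -> None:
    """Raise a readable error for names that cannot be assigned to."""
    if not is_valid_variable_name(name):
        raise ValueError(
            f"'{name}' is not a valid Python variable name. "
            "Use letters, digits and underscores, and do not start with a digit."
        )


for candidate in ["df_subset", "2020_sales", "my df", "class"]:
    print(candidate, is_valid_variable_name(candidate))
```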

Aug 20, 2020 - 1.16.1

We just couldn't hold back and had to crank out another feature this week!

With the Bivariate Time Series Plot, you can quickly plot your variables across time, and choose any granularity level you want. Inspect your average sales by year, quarter, month, etc.

Aug 17, 2020 - 1.16.0

Features

  • Change datetime frequency ⏱

    Either expand your time series column and fill it with values (grid expand) or do a groupby and calculate aggregations. This corresponds to pandas' resample functionality.

  • Univariate Time series summary - quality bar and new summary plot 📊

    We added a data quality bar to the univariate summary of datetime columns and created a new plot which allows you to see the value counts for different aggregation levels of your time series (e.g. day, month, or year).

  • LabelEncoder 🔢

    Do you have text data that you want to transform into numeric labels? The LabelEncoder lets you do exactly that!

  • PlotCreator - Add multiple y's to your plots 📈

    Select multiple y-Axes for your plots.
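
Since "Change datetime frequency" is described above as corresponding to pandas' resample functionality, the two modes can be sketched as follows. The data and column names are invented, and the code that bamboolib exports may differ.

```python
import pandas as pd

# Hypothetical irregular time series.
df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2020-01-01", "2020-01-03", "2020-02-10"]),
        "sales": [10, 20, 30],
    }
)

# Groupby-style: aggregate to a coarser frequency (total sales per month).
# "MS" labels each group with the first day of the month.
monthly = df.resample("MS", on="date")["sales"].sum()

# Grid expand: create one row per day and fill the new rows
# with the last known value (forward fill).
daily = df.set_index("date").resample("D").ffill()

print(monthly)
print(len(daily))
```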

Improvements & Bug Fixes

  • In line plots, plotly draws lines by connecting the observations in your data in the order of their appearance in the dataframe - from top to bottom. This is a feature of plotly that allows you to draw circles or other geometric shapes. However, when working with bamboolib, users just want to have a time-series-like plot. That's what we ensure now by sorting the x-axis.

  • With pandas 1.0+ came the new NA type for missing values. We fixed a bug that caused the PlotCreator to throw an error when confronted with pd.NA.

Aug 03, 2020 - 1.15.0

Features

  • Select Columns for preview ➡ Support dataframes with 10,000+ columns 💪

    Show a preview of your data on a selected subsample of your columns for more convenience and better performance when working with very (VERY) wide dataframes. This feature is also accessible via the glimpse, predictor patterns and correlation matrix.

  • Sampling columns on large data 🧹

    When working with more than 100k rows, some interactive plots can become quite slow. In order to save you (and your computer), we randomly sample 100k rows if your data exceeds the row limit. But don't be afraid! With a click of a button, you can turn the sampling off.

  • Config option: undo_levels 🔧

    Having a history of your transformations is handy, but it comes with the cost of requiring more computer memory. If you work with large datasets, you can save memory by controlling how many steps you can go back in history or turn off the history altogether (the latter with bam.set_config("global.undo_levels", 0)).

Bug Fixes

  • Fixed a pandas 1.0 regression in "extract datetime attributes".

July 24, 2020 - 1.14.1

Features

  • Combine Dataframes ➡⬅

    Choose to stack dataframes on top of each other or to combine them side-by-side

Bug fixes

  • Fixed a pandas 1.0 regression problem that caused numeric to numeric plots to break

July 20, 2020 - 1.14.0

This release adds support for two major versions 🥳

  • Support pandas 1.0+ 🐼

  • Support JupyterLab 2.0+ 🛰

Also, we show more human-friendly errors in the 'Join' transformation and the 'PivotTable' feature.

July 7, 2020 - 1.13.1

This release makes two small adjustments:

  • Show human-friendly errors in the 'New Column Formula' transformation when the inputs are empty.

  • Adjust the code to prevent an error when the 'Pivot Table' returns a pandas.Series and not a pandas.DataFrame.

July 4, 2020 - 1.13.0

In this release, we fixed a couple of bugs and prepared the infrastructure for some of the advanced enterprise features like custom plugins.

June 16, 2020 - 1.12.0

Features

  • Interactive Histogram 2.0

    We released an improved version of our interactive histogram in the Explorer. Zoom into your data with drag & drop or set ranges using the input fields, undo the last zooms or reset the zoom altogether, and get the code of the histogram!

  • Search on Tab 🔎

    If you open bamboolib, you can directly focus the search bar "Search transformations" by pressing Tab.

  • Drop duplicates 🧹

    Do you have any duplicates in your data set? Simply drop them and decide whether you want to keep the first or last of the duplicates in your data.
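
In pandas terms, the first-or-last choice described for "Drop duplicates" maps onto the keep parameter of DataFrame.drop_duplicates. A minimal sketch with made-up data (the exported bamboolib code may differ):

```python
import pandas as pd

# Hypothetical data: customer "a" appears twice.
df = pd.DataFrame({"customer": ["a", "a", "b"], "order": [1, 2, 3]})

# Keep the first occurrence of each duplicated customer ...
first_kept = df.drop_duplicates(subset=["customer"], keep="first")

# ... or keep the last occurrence instead.
last_kept = df.drop_duplicates(subset=["customer"], keep="last")

print(first_kept["order"].tolist())  # [1, 3]
print(last_kept["order"].tolist())   # [2, 3]
```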

Improvements & Bug Fixes

  • bamboolib now also normalizes dataframes whose index column is unnamed.

  • Fixed a bug that made scatter plots not appear in the Explorer

  • Fixed a bug that made "Create New Column Formula" not work when column names are substrings of other column names.

June 2, 2020 - 1.11.0

With this release, we also did a lot of house-keeping.

Features

  • Glimpse 2.0 💫

    Meet our new glimpse - the main entry point to the Explore DataFrame feature. The new glimpse loads asymptotically and asynchronously, which comes with one huge advantage: no matter how large your data is, you get immediate visual feedback on your columns. If your data is large, the glimpse first draws a random sample of your data and then updates itself on the full data in the background.

    Also, you can dive into the univariate summary of each column directly from the new glimpse.

Bug Fixes & Improvements

  • bamboolib shows an update notice when a new version of bamboolib is available

  • All function hints under Explore DataFrame match the name of the dataframe under use

  • Fixed a bug which made the Statistics table disappear when looking at numeric column summaries

  • Fixed a bug that made a text box lose focus in the transformation Add Python Code

May 15, 2020 - 1.10.0

Features

  • Introducing the new tab system 🥳

    Users can now work inside the Plot creator, the Explorer and the Wrangler in parallel! If you click on the Plot creator or Data explorer, a new tab will open, so you can easily move between the wrangler and chart tabs. When you transform the data, the plots in the other tabs automatically adjust!

    We already use the new tab system inside the Explorer as well.

  • More Univariate summary info on column click 🔎

    When clicking on a column name, the user will directly be navigated to a full univariate summary tab

  • Support nullable Integers Int64, Int32, etc. 📛

    Ever found yourself in that weird situation where you wanted to convert a float column to integer and just nothing happens when you run the code? In most cases, that's because there are NaNs in the column you want to convert. When converting float to int, bamboolib makes sure that really happens by exploiting pandas' nullable Int types.

  • Resample dataframe based on datetime column 📅

    We enable users to change the frequency of time series data and to aggregate data based on datetime columns
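
The nullable integer support described above relies on a pandas feature that can be tried directly. The sketch below uses invented data and shows why a float column with missing values cannot become a plain NumPy integer column, while the nullable "Int64" dtype handles it.

```python
import numpy as np
import pandas as pd

# Hypothetical float column with one missing value.
floats = pd.Series([1.0, 2.0, np.nan], name="quantity")

# A plain NumPy integer dtype cannot represent NaN, so this conversion fails.
try:
    floats.astype("int64")
    plain_int_failed = False
except ValueError:
    plain_int_failed = True

# The nullable extension dtype "Int64" (capital I) stores the gap as pd.NA.
nullable = floats.astype("Int64")

print(plain_int_failed)
print(nullable.dtype)
```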

Improvements

  • Improve Live Code Export experience 🖥

    We made the live code export look more like code the user actually typed. Among other things, we removed the bamboolib code export comment.

  • Specify string format when converting datetime to string 📅

    When converting a datetime to string, you can now specify in what format you want to display the strings in your data, e.g. display 2020-01-01 as Jan 1, 2020.

  • All datatype conversion features can directly be accessed via the search bar ⌚

    For example, simply type "to datetime" in the main search bar if you want to convert a column to datetime

  • Fast entering bamboolib

    After a user has clicked on the bamboolib UI button to enter the user interface, she doesn't have to click on it anymore when she inspects a dataframe. This behaviour resets on a kernel restart.
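
As an illustration of the datetime-to-string format strings mentioned above, here is a standard-library sketch of turning 2020-01-01 into "Jan 1, 2020". It assumes the default (English) locale for month names; in pandas, the same format codes can be passed to Series.dt.strftime.

```python
from datetime import date

d = date(2020, 1, 1)

# ISO format, as the date appears in the data.
iso_text = d.strftime("%Y-%m-%d")

# "%d" always zero-pads the day of the month.
padded = d.strftime("%b %d, %Y")

# For an unpadded day, insert d.day directly; this works on every platform.
unpadded = f"{d:%b} {d.day}, {d:%Y}"

print(iso_text)  # 2020-01-01
print(padded)    # Jan 01, 2020
print(unpadded)  # Jan 1, 2020
```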

Bug Fixes

  • Fixed bug so that "Pivot Table" feature can do a count of values

April 28, 2020 - 1.9.0

Features & Improvements

  • Filter on value in other column

    When working with numeric columns, you can specify any valid python expression to filter rows. That also includes specifying values in other columns. And of course, we provide column auto-completion in the filter. If you want that feature also for other column types, please let us know!

  • Convert invalid input in number column to NaN

    Have numbers stored as strings in your pandas DataFrame? When coercing them to numeric datatypes, bamboolib will automatically convert any invalid input to NaN.

  • Impute missing values 🖌

    Replace missing values with fixed values or impute them with the mean, median, or mode of a column. You can also forward fill or backward fill missing values!

  • Rename column after changing dtype

    When changing the datatype of a column, you can directly rename it. Also, when changing the datatype of a column to datetime, we provide examples for format strings so that you can more easily decide how you want to format your datetime column.
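
Two of the features above, coercing invalid numbers to NaN and imputing missing values, have direct pandas counterparts. The following sketch uses invented data and is an approximation of the idea rather than the exact code bamboolib exports.

```python
import pandas as pd

# Hypothetical column with numbers stored as strings and one invalid entry.
df = pd.DataFrame({"amount": ["10", "20", "n/a", "40"]})

# errors="coerce" turns anything that cannot be parsed into NaN.
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")

# Impute the missing value with the column mean ...
mean_filled = df["amount"].fillna(df["amount"].mean())

# ... or carry the previous valid value forward.
forward_filled = df["amount"].ffill()

print(df["amount"].isna().sum())
print(forward_filled.tolist())  # [10.0, 20.0, 20.0, 40.0]
```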

Bug Fixes

  • Fixed a bug that caused buttons in JupyterLab to not show the whole column names in the "Explore DataFrame" ~ columns tab when the column names were too long.

April 06, 2020 - 1.8.0

Features & Improvements

  • Python 3.8 support 🥳

    Python 3.8 support is finally there! And in the future, new versions of Python will be supported much quicker.

  • Column formula improved

    We redesigned the column formula feature to be easier to use.

  • Data wrangling speed improvement on large datasets 🚀

    On an i7 MacBook with 16GB RAM, wrangling data sets with more than 5 million rows is now possible without any lag.

Other Improvements

  • The dimensions of the dataframe are displayed in a human-readable fashion

  • We added descriptions to the dropdown elements in our wrangler ("Search transformations" dropdown) and plot creator

March 20, 2020 - 1.7.0

Features

  • Drop missing values ✂

    With the new transformation "Dropping missing (NA) values", you can quickly drop all missing values in one or multiple columns

  • Reorder columns

    Move columns to the front or back of your DataFrame or place them wherever you want.

  • Binning 🚮

    Categorise numeric values with our new binning feature

  • Groupby / aggregate

    We added new selection options. Apply groupby functions on all columns, on all columns matching a data type or on all columns matching a regular expression.
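
Binning and the regular-expression column selection for groupby can be approximated in plain pandas as shown below. The data, bin edges and labels are invented for illustration, and the code bamboolib generates may differ.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame(
    {
        "region": ["north", "north", "south"],
        "sales_q1": [10, 20, 30],
        "sales_q2": [1, 2, 3],
        "age": [15, 40, 70],
    }
)

# Binning: categorise numeric values into labelled intervals.
# By default each bin includes its right edge, e.g. (0, 18].
df["age_group"] = pd.cut(
    df["age"], bins=[0, 18, 65, 120], labels=["minor", "adult", "senior"]
)

# Groupby on all columns whose name matches a regular expression.
sales_columns = list(df.filter(regex=r"^sales_").columns)
totals = df.groupby("region")[sales_columns].sum()

print(df["age_group"].tolist())
print(totals)
```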

Improvements

  • We enlarged the code box in the "Add Python code" transformation so that you can add larger snippets more easily

March 02, 2020 - 1.6.0

Features

  • Create Pivot Tables 📑

    Create full-fledged pivot tables, including code export.

  • Create multiple aggregations for one column faster than ever 💪

    Apply multiple aggregation functions on a column quickly through a multi-select dropdown

  • Support for categorical dtype

    Convert string columns into categorical ones to save memory.
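
The three features above each have a well-known pandas equivalent. The sketch below uses made-up data; it illustrates the underlying pandas calls and is not necessarily the code bamboolib exports.

```python
import pandas as pd

# Hypothetical sales data.
df = pd.DataFrame(
    {
        "region": ["north", "north", "south", "south"],
        "product": ["a", "b", "a", "b"],
        "sales": [1, 2, 3, 4],
    }
)

# Pivot table: one row per region, one column per product.
pivot = df.pivot_table(
    index="region", columns="product", values="sales", aggfunc="sum"
)

# Several aggregation functions on one column at once.
multi = df.groupby("region")["sales"].agg(["sum", "mean"])

# Categorical dtype: each distinct string is stored once,
# which usually saves memory for repetitive text columns.
df["region"] = df["region"].astype("category")

print(pivot)
print(multi)
print(df["region"].dtype)
```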

Other improvements

  • Rename columns during groupby / aggregate

  • Quick edit the last transformation from the main control panel

February 17, 2020 - 1.5.0

Features

  • Plot Creator 📊

    We added a new plot creator that lets you quickly build interactive plotly express graphs. You can also export the code for further customization. The creator works with the most important plot figure types. In case you are missing a specific figure type, please let us know.

Other Improvements

  • Public license file for automatic activation

    We provide a public license file under ~/.bamboolib/LICENSE that can be used for automatic license activation in docker or on other computers / VMs

  • New styling 🎨

    We replaced the close, back, and delete buttons with icons, among other changes

February 3, 2020 - 1.4.1

Added a README to pypi.org. It was about time! 🙂

February 3, 2020 - 1.4.0

Features

  • Keyboard support - Part 2 ⌨ In order to further improve the user experience when working with the keyboard, we replaced all transformation buttons from the main panel with one text input field, which means that all transformations are now fully searchable via the keyboard.

Other Improvements

  • Better experience when working with the keyboard

    When starting a transformation, the first input is always focused. Also, users can select values from the dropdown using tab.

  • Live code export

    We made improvements on the user experience based on user feedback.

  • More plugin examples

    We added more examples to our plugin docs, e.g. a plugin example for how to write a custom aggregation function and a plugin showing how you can extract attributes from time delta columns. If you want these plugins to become part of our core functionality, please send us an email.

  • Improved merge

    If the user selects a key for df_left and there is a key with the same name in df_right, we will automatically select that key. Also, the exported code simplifies when a merge with same keys has been carried out.

  • Text transformations (such as "filter" and "replace values") support case-sensitivity and regular expressions now.

  • "rename", "string manipulations", and "extract datetime attributes" are top-level transformations now and available via search.

  • After importing bamboolib, you don't need to call bam.enable() anymore in order to show bamboolib when printing df.

  • We removed the normalization step which made sure that the dataframe index is always a RangeIndex.

January 20, 2020 - 1.3.0 🎉

Features

  • Keyboard support ⌨ Transformations can now be created via typing on the keyboard - including auto-completion at every step. Using the mouse is still possible, of course.

  • Plugin - beta 🔌 We started implementing a plugin architecture. Starting with this release, you can write your own custom transformations. Please note that we are in beta mode currently, so the API may change over time.

Bug Fixes

  • Fixed an issue that caused an error when filtering values in a column that contains NAs

  • Fixed some CSS specificity issues in JupyterLab

  • Fixed an issue in JupyterLab that made bamboolib multi-select dropdowns not work properly

January 10, 2020 - 1.2.1

  • Some users couldn't use bamboolib for free on Kaggle. We fixed that, because we love our users :)

January 9, 2020 - 1.2.0

Features

  • Rename transformation results 🔤 You can now rename the dataframe after a transformation (e.g. name the result of a filter "df_subset")

  • Edit last transformation: You can edit the last transformation when looking at the history of your transformations.

  • Increase memory efficiency with numpy dtypes 💾 New support of numpy data types (e.g. int8, int16, ...) so that you can reduce the memory space of your dataframes

Other Bug Fixes & Improvements

  • Live code export now also works on Firefox

December 5, 2019 - 1.1.0

Features

  • We now support JupyterLab

Other Bug Fixes & Improvements

  • Fixed an issue that made the Copy-Code-Button not work

  • Changed the styling of our buttons
