dataframe Archives - Page 27 of 36 - Magenaut

Strip / trim all strings of a dataframe

August 15, 2022 by Magenaut

While cleaning the values of a multitype data frame in Python/pandas, I want to trim the strings. I am currently doing it in two instructions:
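The excerpt stops before the poster's own code, so the following is only a rough sketch of the task, not the original two instructions. It assumes a small made-up frame (`name`, `age`, `city` are hypothetical columns) and strips whitespace from string values while leaving every other type alone.

```python
import pandas as pd

# Hypothetical mixed-type frame; the data in the original post is not shown here.
df = pd.DataFrame(
    {"name": ["  alice ", "bob  "], "age": [30, 25], "city": [" Paris", None]}
)


def strip_strings(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with leading/trailing whitespace removed from str values only."""
    out = frame.copy()
    for col in out.columns:
        # Non-string values (numbers, None/NaN) pass through unchanged.
        out[col] = out[col].map(lambda v: v.strip() if isinstance(v, str) else v)
    return out


cleaned = strip_strings(df)
```

Checking `isinstance(v, str)` per value avoids calling `.strip()` on numbers or missing values in a mixed column.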

pandas – filter dataframe by another dataframe by row elements

August 15, 2022 by Magenaut

I have a dataframe df1 which looks like:

Create Spark DataFrame. Can not infer schema for type

August 15, 2022 by Magenaut

Could someone help me solve this problem I have with Spark DataFrame?

Pandas Groupby and Sum Only One Column

August 15, 2022 by Magenaut

So I have a dataframe, df1, that looks like the following:
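The poster's `df1` is not shown in this excerpt. As a loose illustration of the title only, assuming hypothetical columns `group`, `value`, and `other`, summing a single column per group might look like this:

```python
import pandas as pd

# Hypothetical stand-in for df1; the real columns are not given in the excerpt.
df1 = pd.DataFrame(
    {"group": ["a", "a", "b"], "value": [1, 2, 3], "other": [10, 20, 30]}
)

# Select the one column after groupby so only it is aggregated.
summed = df1.groupby("group", as_index=False)["value"].sum()
```

Selecting `["value"]` before `.sum()` keeps the other columns out of the aggregation.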

Filtering a Pyspark DataFrame with SQL-like IN clause

August 15, 2022 by Magenaut

I want to filter a PySpark DataFrame with a SQL-like IN clause, as in

What is the difference between join and merge in Pandas?

August 15, 2022 by Magenaut

Suppose I have two DataFrames like so:
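The two DataFrames from the post are not included in this excerpt. A small sketch with made-up frames (`left`, `right`, and their columns are assumptions) shows the default behaviour of each method: `merge` does an inner join on shared column names, while `join` does a left join on the index.

```python
import pandas as pd

# Hypothetical frames for illustration only.
left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", "b", "d"], "rval": [4, 5, 6]})

# merge: defaults to how="inner", matching on the named column.
merged = left.merge(right, on="key")

# join: defaults to how="left", matching on the index of both frames.
joined = left.set_index("key").join(right.set_index("key"))
```

Here `merged` keeps only the keys present in both frames, while `joined` keeps every key from `left` and fills the unmatched `rval` with NaN.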

How to find which columns contain any NaN value in Pandas dataframe

August 15, 2022 by Magenaut

Given a pandas dataframe containing possible NaN values scattered here and there:
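The example frame from the post is not reproduced in this excerpt. Assuming a small hypothetical frame with columns `a`, `b`, and `c`, one way to list the columns that contain at least one NaN is:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with missing values in columns "a" and "c".
df = pd.DataFrame(
    {"a": [1.0, np.nan, 3.0], "b": [1, 2, 3], "c": ["x", None, "z"]}
)

# isna() marks missing cells; any() reduces each column to a single boolean.
has_nan = df.isna().any()

# Keep only the column labels whose flag is True.
nan_columns = has_nan[has_nan].index.tolist()
```

Note that `isna()` also treats `None` in an object column as missing, which is why `c` is reported alongside `a`.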

Quickest way to make a get_dummies type dataframe from a column with a multiple of strings

August 15, 2022 by Magenaut

I have a column, ‘col2’, that holds a list of strings in each row. The current code I have is too slow: there are about 2000 unique strings (the letters in the example below) and 4000 rows, so the result ends up as 2000 columns and 4000 rows.
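The poster's slow code and sample data are not shown in this excerpt. As a sketch of one common approach, with a tiny made-up frame (`col1` and the letter values are assumptions), the list column can be exploded to one row per string and then collapsed back to one indicator row per original row. No claim is made here that this is the fastest option.

```python
import pandas as pd

# Hypothetical data: each row of "col2" holds a list of strings.
df = pd.DataFrame({"col1": [1, 2, 3], "col2": [["a", "b"], ["b"], ["a", "c"]]})

# One row per (original row, string); the original index is repeated.
exploded = df["col2"].explode()

# One indicator column per unique string, then merge the repeated index
# entries back into a single row per original row.
dummies = pd.get_dummies(exploded).groupby(level=0).max().astype(int)
```

The result has one column per unique string and one row per original row, with 1 where the row's list contained that string.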

How to flatten a pandas dataframe with some columns as json?

August 15, 2022 by Magenaut

I have a dataframe df that loads data from a database. Most of the columns are JSON strings, while some are even lists of JSON objects. For example:

Ambiguity in Pandas Dataframe / Numpy Array “axis” definition

August 15, 2022 by Magenaut

I’ve been very confused about how axes are defined in Python, and whether they refer to a DataFrame’s rows or columns. Consider the code below:
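The code from the post is cut off in this excerpt. A short sketch on a hypothetical two-column frame (`x` and `y` are made-up names) shows how `axis` behaves for a reduction versus a label-based operation:

```python
import pandas as pd

# Hypothetical frame for illustration only.
df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

# axis=0 reduces along the row axis: one result per column.
col_sums = df.sum(axis=0)

# axis=1 reduces along the column axis: one result per row.
row_sums = df.sum(axis=1)

# For drop, axis=1 names the axis that holds the label: the columns.
without_y = df.drop("y", axis=1)
```

For reductions, `axis` is the dimension that gets collapsed; for `drop`, it is the dimension where the label is looked up. That difference is a frequent source of the confusion described.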