Magenaut

apache-spark

Rename nested field in spark dataframe

August 16, 2022 by Magenaut

Having a dataframe df in Spark:

Categories: Python, Q&A; Tags: apache-spark, dataframe, pyspark, python, rename

Shipping Python modules in pyspark to other nodes

August 15, 2022 by Magenaut

How can I ship C compiled modules (for example, python-Levenshtein) to each node in a Spark cluster?

Categories: Python, Q&A; Tags: apache-spark, python

Pyspark: explode json in column to multiple columns

August 15, 2022 by Magenaut

The data looks like this –

Categories: Python, Q&A; Tags: apache-spark, apache-spark-sql, pyspark, python

Spark iteration time increasing exponentially when using join

August 15, 2022 by Magenaut

I’m quite new to Spark and I’m trying to implement an iterative clustering algorithm (expectation-maximization) in which each centroid is represented by a Markov model. So I need to do iterations and joins.

Categories: Python, Q&A; Tags: apache-spark, iteration, loops, pyspark, python

AttributeError: ‘DataFrame’ object has no attribute ‘map’

August 15, 2022 by Magenaut

I wanted to convert the spark data frame to add using the code below:

Categories: Python, Q&A; Tags: apache-spark, apache-spark-mllib, pyspark, python, spark-dataframe

Python worker failed to connect back

August 15, 2022 by Magenaut

I’m a newbie with Spark and I’m trying to complete a Spark tutorial:
link to tutorial

Categories: Python, Q&A; Tags: apache-spark, local, pyspark, python, windows

Create a custom Transformer in PySpark ML

August 15, 2022 by Magenaut

I am new to Spark SQL DataFrames and ML on them (PySpark).
How can I create a custom tokenizer, which for example removes stop words and uses some libraries from nltk? Can I extend the default one?

Categories: Python, Q&A; Tags: apache-spark, apache-spark-ml, nltk, pyspark, python

Create Spark DataFrame. Can not infer schema for type

August 15, 2022 by Magenaut

Could someone help me solve this problem I have with Spark DataFrame?

Categories: Python, Q&A; Tags: apache-spark, apache-spark-sql, dataframe, pyspark, python

Filtering a Pyspark DataFrame with SQL-like IN clause

August 15, 2022 by Magenaut

I want to filter a Pyspark DataFrame with a SQL-like IN clause, as in

Categories: Python, Q&A; Tags: apache-spark, dataframe, pyspark, python, sql

Spark RDD to DataFrame python

August 15, 2022 by Magenaut

I am trying to convert a Spark RDD to a DataFrame. I have seen the documentation and an example where the schema is passed to the
sqlContext.createDataFrame(rdd, schema) function.

Categories: Python, Q&A; Tags: apache-spark, pyspark, python, spark-dataframe
© 2026 Magenaut • Built with GeneratePress