Trending questions in Apache Spark

0 votes
1 answer

How to restrict a group to only view in Spark?

You can do it dynamically be setting ...READ MORE

Mar 15, 2019 in Apache Spark by Raj
729 views
0 votes
1 answer

How to add modify access for Web UI user?

For a user to have modification access ...READ MORE

Mar 14, 2019 in Apache Spark by Raj
765 views
0 votes
1 answer

Multidimensional Array in Scala

Multidimensional array is an array which store ...READ MORE

Feb 11, 2019 in Apache Spark by Omkar
• 69,220 points
2,096 views
0 votes
1 answer

Enable encryption for local Input and Output

You can enable local I/O encryption like ...READ MORE

Mar 14, 2019 in Apache Spark by Raj
760 views
0 votes
1 answer

How to disable broadcast checksum?

Run the following in the Spark shell: val ...READ MORE

Mar 9, 2019 in Apache Spark by Siri
937 views
0 votes
1 answer

How to disable existing directory check?

To disable this, run the below commands: val ...READ MORE

Mar 10, 2019 in Apache Spark by Siri
898 views
0 votes
1 answer

How to make driver update metrics quickly to executor?

There's a heartbeat signal sent to the ...READ MORE

Mar 10, 2019 in Apache Spark by Siri
898 views
0 votes
1 answer

How to increase the amount of data to be transferred to shuffle service at the same time?

The amount of data to be transferred ...READ MORE

Mar 1, 2019 in Apache Spark by Omkar
• 69,220 points
1,218 views
0 votes
1 answer

What port the Spark dashboard run on?

Spark dashboard by default runs on port ...READ MORE

Mar 6, 2019 in Apache Spark by Rohit
964 views
0 votes
1 answer

Spark shuffle service port number

The default port that shuffle service runs ...READ MORE

Mar 1, 2019 in Apache Spark by Omkar
• 69,220 points
1,167 views
0 votes
1 answer

Companion objects in Scala

When a singleton object is named the ...READ MORE

Feb 24, 2019 in Apache Spark by Uma
1,408 views
0 votes
1 answer

Key Factor Algorithms used for encryption.

The default key factor algorithm used is PBKDF2WithHmacSHA1. You ...READ MORE

Mar 13, 2019 in Apache Spark by Venu
635 views
0 votes
1 answer

Components of Spark

Spark core: The base engine that offers ...READ MORE

Mar 8, 2019 in Apache Spark by Raj
843 views
0 votes
1 answer

Prevent jobs to be killed from Web UI

You need to be careful with this. ...READ MORE

Mar 6, 2019 in Apache Spark by Rohit
907 views
0 votes
1 answer

How to set time for task speculation?

By default, the check for task speculation ...READ MORE

Mar 12, 2019 in Apache Spark by Veer
644 views
0 votes
1 answer

How to increase wait time to launch data-local task?

You can increase the locality wait time ...READ MORE

Mar 11, 2019 in Apache Spark by Raj
678 views
0 votes
1 answer

Increasing retry before blacklisting a node

You can do it dynamically using the ...READ MORE

Mar 12, 2019 in Apache Spark by Raj
670 views
0 votes
1 answer

Delay requesting new executor in dynamic allocation

You can set the duration like this: val ...READ MORE

Mar 13, 2019 in Apache Spark by Venu
590 views
0 votes
1 answer

How to delay live entity updates on Spark ?

You can do this by increasing the ...READ MORE

Mar 6, 2019 in Apache Spark by Rohit
874 views
0 votes
1 answer

Spark event log location

Unless and until you have not changed ...READ MORE

Mar 6, 2019 in Apache Spark by Rohit
863 views
0 votes
1 answer

Invalid syntax in spark

There's a problem with your syntax. There ...READ MORE

Jan 31, 2019 in Apache Spark by Omkar
• 69,220 points
2,298 views
0 votes
1 answer

Changing port for Block Managers

By default, the port of which the ...READ MORE

Mar 10, 2019 in Apache Spark by Siri
668 views
0 votes
1 answer

Why is Spark map output compressed?

Spark thinks that it is a good ...READ MORE

Feb 24, 2019 in Apache Spark by Wasim
1,182 views
0 votes
1 answer

Unresolved dependency issue on sbt package command

Check if you are able to access ...READ MORE

Jan 3, 2019 in Apache Spark by Omkar
• 69,220 points
3,212 views
0 votes
1 answer

where can i get spark-terasort.jar and not .scala file, to do spark terasort in windows.

Hi! I found 2 links on github where ...READ MORE

Feb 13, 2019 in Apache Spark by Omkar
• 69,220 points
1,458 views
0 votes
2 answers

How to use RDD filter with other function?

val x = sc.parallelize(1 to 10, 2)   // ...READ MORE

Aug 17, 2018 in Apache Spark by zombie
• 3,790 points
10,089 views
0 votes
1 answer

Changing Column position in spark dataframe

Yes, you can reorder the dataframe elements. You need ...READ MORE

Apr 19, 2018 in Apache Spark by Ashish
• 2,650 points
14,183 views
0 votes
1 answer

Error using double map.

You have forgotten to mention the case ...READ MORE

Feb 11, 2019 in Apache Spark by Omkar
• 69,220 points
976 views
+1 vote
1 answer

Spark interview

Preparing for an interview? We have something ...READ MORE

Feb 7, 2019 in Apache Spark by Edureka
• 2,960 points
1,047 views
0 votes
1 answer

Error while using Spark SQL filter API

You have to use "===" instead of ...READ MORE

Feb 4, 2019 in Apache Spark by Omkar
• 69,220 points
948 views
0 votes
1 answer

Languages supported by Apache Spark?

Apache Spark supports the following four languages:  Scala, ...READ MORE

Sep 3, 2018 in Apache Spark by nitinrawat895
• 11,380 points
7,505 views
0 votes
1 answer

Query regarding a spark split logic

First, import the data in Spark and ...READ MORE

Feb 9, 2019 in Apache Spark by Omkar
• 69,220 points
603 views
–1 vote
1 answer

Not able to use sc in spark shell

Seems like master and worker are not ...READ MORE

Jan 3, 2019 in Apache Spark by Omkar
• 69,220 points
1,991 views
0 votes
1 answer

How to get ID of a map task in Spark?

you can access task information using TaskContext: import org.apache.spark.TaskContext sc.parallelize(Seq[Int](), ...READ MORE

Nov 20, 2018 in Apache Spark by Frankie
• 9,830 points
3,638 views
–1 vote
1 answer

Deciding number of spark context objects

How many spark context objects you should ...READ MORE

Jan 16, 2019 in Apache Spark by Omkar
• 69,220 points
957 views
0 votes
1 answer

Spark and Scale Auxiliary constructor doubt

println("Slayer") is an anonymous block and gets ...READ MORE

Jan 8, 2019 in Apache Spark by Omkar
• 69,220 points
869 views
0 votes
1 answer

How to add third party java jars for use in PySpark?

You can add external jars as arguments ...READ MORE

Jul 4, 2018 in Apache Spark by nitinrawat895
• 11,380 points

edited Nov 19, 2021 by Sarfaraz 8,939 views
0 votes
1 answer

Is there an API for implementing graphs in Spark?

GraphX is the Spark API for graphs and ...READ MORE

Jan 5, 2019 in Apache Spark by Frankie
• 9,830 points
963 views
0 votes
1 answer

How to open/stream .zip files through Spark?

You can try and check this below ...READ MORE

Nov 20, 2018 in Apache Spark by Frankie
• 9,830 points
2,705 views
0 votes
1 answer

Filter, Option or FlatMap in spark

If, for option 2, you mean have ...READ MORE

Nov 9, 2018 in Apache Spark by Frankie
• 9,830 points
3,019 views
+1 vote
2 answers

Apache Spark vs Apache Spark 2

Spark 2 doesn't differ much architecture-wise from ...READ MORE

Apr 24, 2018 in Apache Spark by kurt_cobain
• 9,350 points
9,602 views
+1 vote
1 answer

How can I write a text file in HDFS not from an RDD, in Spark program?

Yes, you can go ahead and write ...READ MORE

May 29, 2018 in Apache Spark by Shubham
• 13,490 points
8,696 views
0 votes
1 answer

Is 'sparkline' a method?

I suggest you to check 2 things That jquery.sparkline.js is actually ...READ MORE

Nov 9, 2018 in Apache Spark by Frankie
• 9,830 points
1,516 views
0 votes
1 answer

How can I minimize data transfers when working with Spark?

Minimizing data transfers and avoiding shuffling helps ...READ MORE

Sep 19, 2018 in Apache Spark by zombie
• 3,790 points
3,374 views
0 votes
1 answer

How to find max value in pair RDD?

Use Array.maxBy method: val a = Array(("a",1), ("b",2), ...READ MORE

May 26, 2018 in Apache Spark by nitinrawat895
• 11,380 points
8,190 views
0 votes
1 answer

What are the levels of parallelism in spark streaming ?

> In order to reduce the processing ...READ MORE

Jul 27, 2018 in Apache Spark by zombie
• 3,790 points
5,244 views
0 votes
1 answer

Is there any way to check the Spark version?

There are 2 ways to check the ...READ MORE

Apr 19, 2018 in Apache Spark by nitinrawat895
• 11,380 points
9,087 views
0 votes
1 answer

When running Spark on Yarn, do I need to install Spark on all nodes of Yarn Cluster?

No, it is not necessary to install ...READ MORE

Jun 14, 2018 in Apache Spark by nitinrawat895
• 11,380 points
6,568 views
0 votes
1 answer

Difference between sparkContext, JavaSparkContext, SQLContext, & SparkSession?

Yes, there is a difference between the ...READ MORE

Jul 4, 2018 in Apache Spark by nitinrawat895
• 11,380 points
5,538 views
+1 vote
2 answers

Hadoop 3 compatibility with older versions of Hive, Pig, Sqoop and Spark

Hadoop 3 is not widely used in ...READ MORE

Apr 20, 2018 in Apache Spark by kurt_cobain
• 9,350 points
6,358 views