We are happy to announce that HDInsight Tools for VSCode now supports argparse and accepts parameter-based PySpark job submission. We have also enabled the tools to support Spark 2.2 for PySpark authoring and job submission.
The argparse feature gives you great flexibility when authoring, testing, and submitting PySpark code, for both batch jobs and interactive queries. You can take full advantage of argparse in PySpark and simply keep your configuration and your job-related arguments in the JSON-based configuration file.
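As an illustration of what an argparse-driven PySpark script can look like, here is a minimal sketch. The job itself (a word count), the argument names `--input` and `--top`, and the function names `parse_args` and `run_job` are assumptions made for this example and do not come from the tools.

```python
import argparse


def parse_args(argv=None):
    """Parse job arguments; argv=None makes argparse read sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Hypothetical word-count job")
    parser.add_argument("--input", required=True, help="Path of the text to read")
    parser.add_argument("--top", type=int, default=10, help="Number of words to show")
    return parser.parse_args(argv)


def run_job(args):
    """Count words with Spark; needs a PySpark installation or a cluster."""
    # Imported lazily so that argument parsing can be tested without Spark.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("argparse-demo").getOrCreate()
    lines = spark.read.text(args.input)  # one row per line, in column "value"
    words = lines.select(F.explode(F.split(F.col("value"), r"\s+")).alias("word"))
    words.groupBy("word").count().orderBy(F.desc("count")).show(args.top)
    spark.stop()
```

A submitted script would typically end with `run_job(parse_args())`, so that whatever arguments the job is submitted with reach the parser through `sys.argv`.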
The Spark 2.2 update lets you benefit from the new functionality and use the new libraries and APIs from Spark 2.2 in VSCode. You can create, author, and submit a Spark 2.2 PySpark job to a Spark 2.2 cluster. Because Spark 2.2 is backward compatible, you can also submit your existing Spark 2.0 and Spark 2.1 PySpark scripts to a Spark 2.2 cluster.
Summary of key new features
Argparse support – set up your arguments in JSON format.
1. Set up configurations: go to the command palette and choose the command HDInsight: Set Configuration.
2. Set up the parameters in the xxx_hdi_settings.json file, including script to cluster, Livy configuration, Spark configuration, etc.
Spark 2.2 support – submit PySpark batch jobs and interactive queries to a Spark 2.2 cluster.
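To make the argparse settings step more concrete, the sketch below shows how a list of arguments stored in JSON could be handed to an argparse parser. The layout of the settings text is hypothetical: the `livy_conf` key and its contents are assumptions for illustration (`args` and `driverMemory` mirror field names in Livy's batch request), and the real xxx_hdi_settings.json schema may differ.

```python
import argparse
import json

# Hypothetical settings content; the real xxx_hdi_settings.json may differ.
SETTINGS_TEXT = """
{
  "livy_conf": {
    "args": ["--input", "wasb:///example/data.txt", "--top", "5"],
    "driverMemory": "2g"
  }
}
"""


def job_args_from_settings(text):
    """Return the job's argument list, or an empty list if none is configured."""
    settings = json.loads(text)
    return list(settings.get("livy_conf", {}).get("args", []))


# The parser a PySpark script might define (argument names are assumptions).
parser = argparse.ArgumentParser(description="Hypothetical PySpark job")
parser.add_argument("--input", required=True)
parser.add_argument("--top", type=int, default=10)

argv = job_args_from_settings(SETTINGS_TEXT)
args = parser.parse_args(argv)
print(args.input, args.top)  # prints: wasb:///example/data.txt 5
```

Keeping the arguments as a flat list of strings means the script sees them just as it would on a command line, so the same parser works for local tests and for submitted jobs.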
How to install or update
First, install Visual Studio Code and download Mono 4.2.x (for Linux and Mac). Then get the latest HDInsight Tools by going to the VSCode Extension repository or the VSCode Marketplace and searching for HDInsight Tools for VSCode.
For more information about HDInsight Tools for VSCode, please use the following resources:
- User Manual: HDInsight Tools for VSCode
- User Manual: Set Up PySpark Interactive Environment
- Demo Video: HDInsight for VSCode Video
- Hive LLAP: Use Interactive Query with HDInsight
If you have questions, feedback, comments, or bug reports, please use the comments below or send a note to firstname.lastname@example.org.
Source: Azure Blog Feed