How can I make user-defined parameters required inside a pipeline?
If I have a parameter that I defined, how can I make it required like this
HDInsight cluster creation - error on configuration + pricing
Hi, I have been trying to create an HDInsight cluster from Canada, but it fails at the configuration step. I subscribed to PAYG and tried selecting different nodes, but none of them works; it gives me the error "You have reached your…
How do I find Hive server information?
I'm trying to create a pipeline to copy data from CSV to a Databricks table. To do so, I believe I need to set up a Hive linked service. However, I'm not sure where I can find the necessary information to fill out the LS form - we had a…
Azure HDInsight Spark job is failing with Logger Error
Hello Team, our jobs have recently been failing with this error: ERROR RawSocketSender [MdsLoggerSenderThread]: org.fluentd.logger.sender.RawSocketSender java.net.SocketException: Broken pipe (Write failed). All these PySpark jobs were running fine…
Real Case Scenarios
Hello, where can I find case scenarios or real-life use cases of, for example, cloud models or high availability and scalability? What I mean is, for example, that hybrid cloud is used by banks because they want to control the database and security. …
How to use a UA managed identity in a Data Factory on-demand HDInsight linked service
When creating an on-demand HDInsight linked service, there's missing detail on how to configure a user-assigned managed identity instead of a service principal. Steps are shown for how to add a UA managed identity to the Data Factory, but what values…
Spark DataFrame writing issue in Azure from Spark: One of the request inputs is not valid
I am able to read data from Azure Blob Storage, but when writing back to Azure Storage it throws the error below. I am running this program on my local machine. Can someone help me out with this, please? Program: val config = new SparkConf(); …
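Since the full program is truncated, this is only an assumption, but a frequent cause of "One of the request inputs is not valid" when writing to Azure Blob Storage is a container or blob path that violates Azure's naming rules. A small checker for container names, which must be 3-63 characters of lowercase letters, digits, and hyphens, start and end with a letter or digit, and contain no consecutive hyphens:

```python
import re

def is_valid_container_name(name: str) -> bool:
    """Check a candidate name against Azure Blob Storage container naming rules:
    3-63 characters; lowercase letters, digits, and hyphens only; must start and
    end with a letter or digit; no consecutive hyphens."""
    if not 3 <= len(name) <= 63:
        return False
    if "--" in name:
        return False
    return re.fullmatch(r"[a-z0-9](?:[a-z0-9-]*[a-z0-9])?", name) is not None

# Names with uppercase letters or underscores are rejected.
print(is_valid_container_name("my-output-data"))  # True
print(is_valid_container_name("My_Output"))       # False
```

If the name checks out, the next things to verify are the storage account key configuration and the exact wasbs:// URL being written to.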
How to add a sub-queue in YARN
I already have queues set up in YARN on HDInsight; they were set up with the Ambari UI. I have a queue for Sqoop that takes up 70% of the cluster. However, I have a few huge Sqoop jobs with a lot of mappers that take up 100% of the queue and block…
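For reference, the Capacity Scheduler lets a queue be split into child queues whose capacities sum to 100% of the parent, which via Ambari lands in the capacity-scheduler configuration. A minimal sketch, assuming the existing queue is named `sqoop` and using hypothetical child queues `large` and `small`:

```xml
<!-- Hypothetical queue names; child capacities are percentages of the parent
     queue's share (here, the sqoop queue's 70%) and must sum to 100. -->
<property>
  <name>yarn.scheduler.capacity.root.sqoop.queues</name>
  <value>large,small</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.sqoop.large.capacity</name>
  <value>60</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.sqoop.large.maximum-capacity</name>
  <value>80</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.sqoop.small.capacity</name>
  <value>40</value>
</property>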
HDInsight HBase vs Databricks
Hi, this is probably answered, or perhaps a tall question. What would be the differences/benefits of using HDInsight HBase vs Databricks? Azure storage is definitely one. If the aim is to have the convenience of traditional tables & SQL with…
To use Azure Data Lake Storage Gen2 with Azure HDInsight clusters, do I have to attach the storage to clusters as linked additional storage?
To use Azure Data Lake Storage Gen2 with Azure HDInsight clusters, do I have to attach the storage to the clusters as linked additional storage? Or, as long as permission is granted to the managed identity, could a Spark Scala application access the storage using…
Creating an HDInsight Spark 4.0 cluster with a managed identity and a Data Lake Storage Gen2 storage account
Hi! I am trying to create an HDI cluster with an ADLS Gen2 storage account as the primary storage account. I have created multiple containers inside my storage account, and I want to limit the access of the managed identity to the other containers. …
HDInsight cluster is in the Error status, even when the user-assigned managed identity is assigned a role as Storage Blob Data Owner.
HDInsight cluster, worker node, E32_v3 (256 GB), memory issue
HDInsight cluster - worker node - E32_v3 (256 GB). It is showing 911 GB of memory on the Ambari portal, while the MS document for the E32_v3 HDInsight worker node shows 1600 GB of space. Why is there a discrepancy?
Unable to create HDInsight cluster through free azure subscription
There are not enough cores available to support the selected number of nodes. Please adjust the number of nodes selected, pick a different region, or open a support case to request additional HDInsight cores. You have reached your subscription's…
Azure HDInsight: There are not enough cores available
I want to create HDInsight in my pay-as-you-go subscription, but I get the error: "There are not enough cores available to support the selected number of nodes." I checked my subscription's usage and quotas for compute, and usage for every processor…
Files not getting saved in Azure Blob using Spark on an HDInsight cluster
We've set up an HDInsight cluster on Azure with Blob as the storage for Hadoop. We tried uploading files to Hadoop using the Hadoop CLI, and the files were getting uploaded to the Azure Blob. Command used to upload: hadoop fs -put somefile…
Connect Synapse Spark Pool with Kafka on HDInsight
I have created a Kafka on HDInsight cluster. I have also created an Azure Synapse Analytics Spark pool in the same region as HDInsight. I need guidance on how to consume topics from Kafka into Spark Structured Streaming. Any documentation or steps will be of…
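For reference, Spark Structured Streaming reads Kafka through its built-in `kafka` source. A minimal sketch, assuming the `spark-sql-kafka-0-10` connector is on the classpath and a running cluster; the broker FQDNs and topic name below are placeholders:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-consumer").getOrCreate()

# Placeholder broker hosts and topic; real broker FQDNs can be read from Ambari.
df = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "wn0-kafka:9092,wn1-kafka:9092")
      .option("subscribe", "mytopic")
      .load())

# Kafka delivers key/value as binary; cast the value to string before processing.
query = (df.select(col("value").cast("string"))
         .writeStream
         .format("console")
         .start())
query.awaitTermination()
```

Note that Kafka on HDInsight brokers are reachable only inside the cluster's virtual network, so the Synapse Spark pool also needs network connectivity to that VNet (e.g., peering) in addition to the code above.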
HDInsight - Kafka - Version 3.2
Hi all, is there a roadmap to release a cluster with a Kafka version higher than 2.4.1 in the near future? Thanks for the info in advance. Best regards, Michael
Can Azure Stream Analytics read from Kafka on HDInsight and write to a Delta Lake table on a Synapse lake?
Hello, I am looking for guidance on building a new event-driven platform. The options we are exploring for processing are Azure Stream Analytics and Apache Spark Structured Streaming in Synapse. The source is likely going to be Kafka on HDInsight …
What is the best way to copy data from my on-prem Hadoop cluster to the Azure HDInsight cluster?
Hi experts, what is the best way to copy data from my on-prem Hadoop cluster to the Azure HDInsight cluster? We recently deployed a new HDInsight cluster, and now I would like to copy some data from my on-prem cluster to HDInsight. Thanks,
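One common approach for bulk copies between Hadoop clusters is DistCp run from the on-prem side, writing straight to the HDInsight cluster's storage account. A sketch with placeholder account, container, and path names:

```shell
# Placeholder names; requires the on-prem cluster to have the Azure Storage
# connector and the account key (fs.azure.account.key.<account>...) configured.
hadoop distcp \
  hdfs://onprem-namenode:8020/data/warehouse \
  wasbs://mycontainer@mystorageaccount.blob.core.windows.net/data/warehouse
```

For a cluster whose primary storage is ADLS Gen2, the target would be an abfs:// URI instead; Azure Data Factory's Copy activity is an alternative when an orchestrated or incremental copy is needed.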