Read from a Collection
The Macrometa Collections Databricks Connector integrates Apache Spark with Macrometa collections, allowing you to read data from Macrometa collections using Apache Spark.
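The snippets below assume the following Spark imports are in scope. This is a minimal sketch; your application may already import these elsewhere:

```scala
// Standard Spark imports used by the examples in this section.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.functions.rand
```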
- Set up your source options:

  ```scala
  val sourceOptions = Map(
    "regionUrl" -> "<REGION_URL>",
    "apiKey" -> "apikey <API_KEY>",
    "fabric" -> "<FABRIC>",
    "collection" -> "<COLLECTION>",
    "batchSize" -> "<BATCH_SIZE>",
    "query" -> "<QUERY>"
  )
  ```
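  For illustration only, a filled-in map might look like the sketch below. The region URL, fabric, collection name, and query shown are hypothetical placeholders; the query is a simple C8QL statement that returns every document in the collection:

  ```scala
  // Illustrative values only -- substitute your own Macrometa endpoint, API key,
  // fabric, collection, and query.
  val sourceOptions = Map(
    "regionUrl" -> "api-play.paas.macrometa.io",     // hypothetical region URL
    "apiKey" -> "apikey xxxxxxxxxxxxxxxx",           // your API key, prefixed with "apikey "
    "fabric" -> "_system",                           // fabric name
    "collection" -> "employees",                     // hypothetical source collection
    "batchSize" -> "100",                            // documents fetched per batch
    "query" -> "FOR doc IN employees RETURN doc"     // query to run against the collection
  )
  ```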
- Create a Spark session:

  ```scala
  val spark = SparkSession.builder()
    .appName("MacrometaCollectionApp")
    .master("local[*]")
    .getOrCreate()
  ```
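  Note that `.master("local[*]")` targets a local Spark runtime. In a Databricks notebook a SparkSession is already provided, so a minimal sketch there would simply reuse it:

  ```scala
  // In a Databricks notebook a session already exists; getOrCreate() returns
  // that existing session instead of creating a new one.
  val spark = SparkSession.builder()
    .appName("MacrometaCollectionApp")
    .getOrCreate()
  ```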
- Read from the Macrometa collection:

  - Auto infer schema:

    ```scala
    val inputDF = spark
      .read
      .format("com.macrometa.spark.collection.MacrometaTableProvider")
      .options(sourceOptions)
      .load()
    ```

  - User defined schema:

    ```scala
    val userSchema = new StructType().add("value", "string")

    val inputDF = spark
      .read
      .format("com.macrometa.spark.collection.MacrometaTableProvider")
      .options(sourceOptions)
      .schema(userSchema)
      .load()
    ```
- Show the read results (displays the first 20 rows by default):

  ```scala
  inputDF.show()
  ```
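  If you let the connector infer the schema, it can help to confirm what was actually inferred before writing transformations. A small sketch using standard DataFrame methods:

  ```scala
  // Print the inferred (or user-defined) schema and a row count as a sanity check.
  inputDF.printSchema()
  println(s"Rows read: ${inputDF.count()}")
  ```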
- Perform transformations on the DataFrame. The code block below assumes that each document in the source collection has a `value` property. Replace the column names with ones that match your own schema.

  ```scala
  val modifiedDF = inputDF
    .select("value")
    .withColumnRenamed("value", "number")
    .withColumn("randomNumber", rand())
  ```
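  To verify the transformation, you can display the resulting DataFrame. A minimal sketch; the column names follow the example above:

  ```scala
  // Inspect the transformed DataFrame: the renamed "number" column plus the
  // generated "randomNumber" column.
  modifiedDF.printSchema()
  modifiedDF.show()
  ```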