Revision e73ede7939f2005fe58691d9db0858df65e202fe authored by Jiaheng Tang on 26 July 2024, 01:44:05 UTC, committed by Wenchen Fan on 26 July 2024, 01:44:05 UTC
### What changes were proposed in this pull request?

Support clustering columns in the `listColumns` API.

### Why are the changes needed?

The `listColumns` API already reports partition and bucket columns; clustering columns should be reported the same way.

### Does this PR introduce _any_ user-facing change?

Yes. `listColumns` now returns an additional field, `isCluster`, that indicates whether the column is a clustering column.

Old output for `spark.catalog.listColumns`:
```
+----+-----------+--------+--------+-----------+--------+
|name|description|dataType|nullable|isPartition|isBucket|
+----+-----------+--------+--------+-----------+--------+
|   a|       null|     int|    true|      false|   false|
|   b|       null|  string|    true|      false|   false|
|   c|       null|     int|    true|      false|   false|
|   d|       null|  string|    true|      false|   false|
+----+-----------+--------+--------+-----------+--------+
```

New output:
```
+----+-----------+--------+--------+-----------+--------+---------+
|name|description|dataType|nullable|isPartition|isBucket|isCluster|
+----+-----------+--------+--------+-----------+--------+---------+
|   a|       null|     int|    true|      false|   false|    false|
|   b|       null|  string|    true|      false|   false|    false|
|   c|       null|     int|    true|      false|   false|    false|
|   d|       null|  string|    true|      false|   false|    false|
+----+-----------+--------+--------+-----------+--------+---------+
```
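
The sample table above has no clustering columns, so every `isCluster` value is `false`. As a rough illustration of what the new field conveys, here is a Spark-free Python sketch that models the output rows. `ColumnInfo`, `list_columns`, and the choice of `a` and `b` as clustering columns are hypothetical names and values made up for this example; this is not how Spark itself computes the flag, only a model of the resulting shape.

```python
from typing import NamedTuple, Optional


class ColumnInfo(NamedTuple):
    """Hypothetical stand-in for one row of the listColumns output."""
    name: str
    description: Optional[str]
    dataType: str
    nullable: bool
    isPartition: bool
    isBucket: bool
    isCluster: bool  # the field added by this change


def list_columns(schema, partition_cols=(), bucket_cols=(), cluster_cols=()):
    """Build one ColumnInfo per (name, dataType) pair in `schema`.

    Each flag is true only if the column name appears in the
    corresponding collection of column names.
    """
    partition = set(partition_cols)
    bucket = set(bucket_cols)
    cluster = set(cluster_cols)
    return [
        ColumnInfo(
            name=name,
            description=None,
            dataType=data_type,
            nullable=True,
            isPartition=name in partition,
            isBucket=name in bucket,
            isCluster=name in cluster,
        )
        for name, data_type in schema
    ]


# Same four columns as the sample output; clustering on a and b is assumed.
schema = [("a", "int"), ("b", "string"), ("c", "int"), ("d", "string")]
rows = list_columns(schema, cluster_cols=["a", "b"])

for row in rows:
    print(row.name, row.isCluster)
```

Under these assumptions, `a` and `b` would be flagged as clustering columns while `c` and `d` would not, and `isPartition` and `isBucket` stay `false` for all four.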

### How was this patch tested?

New unit tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #47451 from zedtang/list-clustering-columns.

Authored-by: Jiaheng Tang <jiaheng.tang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
1 parent f3b819e