Revision - e73ede7 - [SPARK-45787][SQL] Support Catalog.listColumns for clustering columns

Revision e73ede7939f2005fe58691d9db0858df65e202fe authored by Jiaheng Tang on 26 July 2024, 01:44:05 UTC, committed by Wenchen Fan on 26 July 2024, 01:44:05 UTC

[SPARK-45787][SQL] Support Catalog.listColumns for clustering columns

### What changes were proposed in this pull request?

Support clustering columns in the `Catalog.listColumns` API.

### Why are the changes needed?

The `listColumns` API already reports partition and bucket columns; clustering columns should be reported the same way.

### Does this PR introduce _any_ user-facing change?

Yes. The `listColumns` output now includes an additional field, `isCluster`, which indicates whether the column is a clustering column.

Old output for `spark.catalog.listColumns`:
```
+----+-----------+--------+--------+-----------+--------+
|name|description|dataType|nullable|isPartition|isBucket|
+----+-----------+--------+--------+-----------+--------+
|   a|       null|     int|    true|      false|   false|
|   b|       null|  string|    true|      false|   false|
|   c|       null|     int|    true|      false|   false|
|   d|       null|  string|    true|      false|   false|
+----+-----------+--------+--------+-----------+--------+
```

New output:
```
+----+-----------+--------+--------+-----------+--------+---------+
|name|description|dataType|nullable|isPartition|isBucket|isCluster|
+----+-----------+--------+--------+-----------+--------+---------+
|   a|       null|     int|    true|      false|   false|    false|
|   b|       null|  string|    true|      false|   false|    false|
|   c|       null|     int|    true|      false|   false|    false|
|   d|       null|  string|    true|      false|   false|    false|
+----+-----------+--------+--------+-----------+--------+---------+
```
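
As a rough illustration of how a caller might use the new field, the sketch below filters a list of column records down to the clustering columns. It does not touch Spark: `ColumnInfo` and `clustering_columns` are hypothetical names, the record only mirrors the fields shown in the new output, and the sample rows (with `a` and `b` marked as clustering columns) are invented for the example.

```python
from typing import Iterable, List, NamedTuple, Optional


class ColumnInfo(NamedTuple):
    """Hypothetical stand-in that mirrors the fields in the new output."""
    name: str
    description: Optional[str]
    dataType: str
    nullable: bool
    isPartition: bool
    isBucket: bool
    isCluster: bool  # the field added by this change


def clustering_columns(columns: Iterable[ColumnInfo]) -> List[str]:
    """Return the names of the columns flagged as clustering columns."""
    return [c.name for c in columns if c.isCluster]


# Invented sample data: assume the table is clustered by `a` and `b`.
sample = [
    ColumnInfo("a", None, "int", True, False, False, True),
    ColumnInfo("b", None, "string", True, False, False, True),
    ColumnInfo("c", None, "int", True, False, False, False),
    ColumnInfo("d", None, "string", True, False, False, False),
]

print(clustering_columns(sample))
```

The helper only reads the `name` and `isCluster` attributes, so under that assumption it should work on any records that expose those two fields.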

### How was this patch tested?

New unit tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #47451 from zedtang/list-clustering-columns.

Authored-by: Jiaheng Tang <jiaheng.tang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

1 parent f3b819e
