Revision - e46d2e2 - [SPARK-40270][PS] Make 'compute.max_rows' as None working in [...] - origin: https://github.com/apache/spark

https://github.com/apache/spark

05 April 2024, 20:24:39 UTC

Revision e46d2e2d476e85024f1c53fdaf446fdd2e293d28 authored by Hyukjin Kwon on 30 August 2022, 07:25:26 UTC, committed by Hyukjin Kwon on 30 August 2022, 07:26:24 UTC

[SPARK-40270][PS] Make 'compute.max_rows' as None working in DataFrame.style

This PR makes setting the `compute.max_rows` option to `None` work in `DataFrame.style` as expected, instead of throwing an exception, by collecting all of the data into a pandas DataFrame.

To make the configuration work as expected.

Yes, this is a user-facing change:

```python
import pyspark.pandas as ps
ps.set_option("compute.max_rows", None)
ps.get_option("compute.max_rows")
ps.range(1).style
```

**Before:**

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../spark/python/pyspark/pandas/frame.py", line 3656, in style
    pdf = self.head(max_results + 1)._to_internal_pandas()
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'
```

**After:**

```
<pandas.io.formats.style.Styler object at 0x7fdf78250430>
```
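The difference between the two behaviors comes down to guarding the `max_results + 1` expression against `None`. The following is a minimal sketch of that guard, not the actual Spark patch: `rows_for_style` is a hypothetical helper, and a plain Python list stands in for the DataFrame so the example runs without Spark or pandas.

```python
from typing import List, Optional


def rows_for_style(rows: List[int], max_rows: Optional[int]) -> List[int]:
    """Hypothetical helper: choose which rows to pass on for styling."""
    if max_rows is None:
        # No limit configured: keep every row instead of evaluating None + 1.
        return list(rows)
    # Limit configured: keep one row beyond the limit,
    # mirroring the head(max_results + 1) call in the traceback.
    return rows[: max_rows + 1]


data = list(range(5))
unlimited = rows_for_style(data, None)  # every row is kept
limited = rows_for_style(data, 2)       # at most max_rows + 1 rows are kept
```

Checking for `None` before doing the arithmetic is what avoids the `TypeError` shown in the "Before" output; how the real implementation collects the full DataFrame is not shown here.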

Manually tested, and a unit test was added.

Closes #37718 from HyukjinKwon/SPARK-40270.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 0f0e8cc26b6c80cc179368e3009d4d6c88103a64)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>

1 parent c9710c5


File	Mode	Size
.github
.idea
R
assembly
bin
binder
build
common
conf
core
data
dev
docs
examples
external
graphx
hadoop-cloud
launcher
licenses
licenses-binary
mllib
mllib-local
project
python
repl
resource-managers
sbin
sql
streaming
tools
.asf.yaml	-rw-r--r--	1.1 KB
.gitattributes	-rw-r--r--	130 bytes
.gitignore	-rw-r--r--	2.0 KB
CONTRIBUTING.md	-rw-r--r--	997 bytes
LICENSE	-rw-r--r--	13.1 KB
LICENSE-binary	-rw-r--r--	22.4 KB
NOTICE	-rw-r--r--	2.0 KB
NOTICE-binary	-rw-r--r--	56.5 KB
README.md	-rw-r--r--	4.4 KB
appveyor.yml	-rw-r--r--	2.7 KB
pom.xml	-rw-r--r--	137.4 KB
scalastyle-config.xml	-rw-r--r--	22.0 KB
