https://github.com/apache/spark
Revision 4e8907a110fe44733c35b18ea34a6212a4b2c2dc authored by Stefaan Lippens on 12 January 2023, 09:24:30 UTC, committed by Hyukjin Kwon on 12 January 2023, 09:25:47 UTC
See https://issues.apache.org/jira/browse/SPARK-41989 for an in-depth explanation.

Short summary: at import time, `pyspark/pandas/__init__.py` calls `logging.warning()`, which silently calls `logging.basicConfig()` when the root logger has no handlers yet.
So by importing `pyspark.pandas` (directly or indirectly), a user might unknowingly break their own logging setup (e.g. one based on `logging.basicConfig()` or related functions). `logging.getLogger(...).warning()` does not trigger this behavior.
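
The difference can be reproduced with the standard library alone. The snippet below is a minimal sketch, not the actual pyspark code: the logger name `demo.library`, the function name `demo` and the `USER` format string are made up for illustration. It counts the handlers on the root logger after each kind of call and then checks whether a later, user-issued `logging.basicConfig()` still has any effect.

```python
import logging


def demo() -> dict:
    """Compare logging.getLogger(...).warning() with module-level logging.warning()."""
    root = logging.getLogger()
    saved_handlers = list(root.handlers)
    # Start from a root logger without handlers so the result is repeatable.
    for handler in saved_handlers:
        root.removeHandler(handler)
    result = {}
    try:
        # A named logger emits through logging's last-resort handler and
        # leaves the root logger untouched. ("demo.library" is a made-up name.)
        logging.getLogger("demo.library").warning("via named logger")
        result["after_named_logger"] = len(root.handlers)

        # The module-level function calls basicConfig() implicitly when the
        # root logger has no handlers, which attaches a default handler.
        logging.warning("via module-level function")
        result["after_module_level"] = len(root.handlers)

        # A later user-level basicConfig() is a no-op because the root logger
        # already has a handler (unless force=True is passed).
        implicit_handler = root.handlers[0]
        logging.basicConfig(format="USER %(message)s")
        result["user_config_ignored"] = root.handlers == [implicit_handler]
    finally:
        # Restore whatever handlers were installed before the demo.
        for handler in list(root.handlers):
            root.removeHandler(handler)
        for handler in saved_handlers:
            root.addHandler(handler)
    return result


if __name__ == "__main__":
    print(demo())
```

In this sketch the named logger leaves the root logger without handlers, while the module-level call attaches one, so the user's own `basicConfig()` call afterwards is silently ignored.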

User-defined logging setups will be more predictable.

Manual testing so far.
I'm not sure it's worthwhile to cover this with a unit test.

Closes #39516 from soxofaan/SPARK-41989-pyspark-pandas-logging-setup.

Authored-by: Stefaan Lippens <stefaan.lippens@vito.be>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
(cherry picked from commit 04836babb7a1a2aafa7c65393c53c42937ef75a4)
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
1 parent 1421811
[SPARK-41989][PYTHON] Avoid breaking logging config from pyspark.pandas