A dataset containing 23 categorical properties of 23 species of gilled mushrooms, including a classification of each mushroom as edible or poisonous.

mushroom

Format

A data frame with 8124 rows and 23 columns:

bruises

bruises no

cap-color

brown yellow white gray red pink buff purple cinnamon green

cap-shape

convex bell sunken flat knobbed conical

cap-surface

smooth scaly fibrous grooves

edible

poisonous edible

gill-attachment

free attached

gill-color

black brown gray pink white chocolate purple red buff green yellow orange

gill-size

narrow broad

gill-spacing

close crowded

habitat

urban grasses meadows woods paths waste leaves

odor

pungent almond anise none foul creosote fishy spicy musty

population

scattered numerous abundant several solitary clustered

ring-number

one two none

ring-type

pendant evanescent large flaring none

spore-print-color

black brown purple chocolate white green orange yellow buff

stalk-color-above-ring

white gray pink brown buff red orange cinnamon yellow

stalk-color-below-ring

white pink gray buff brown red yellow orange cinnamon

stalk-root

equal club bulbous rooted NA

stalk-shape

enlarging tapering

stalk-surface-above-ring

smooth fibrous silky scaly

stalk-surface-below-ring

smooth fibrous scaly silky

veil-color

white brown orange yellow

veil-type

partial

Source

https://archive.ics.uci.edu/ml/datasets/Mushroom

Details

The records are drawn from G. H. Lincoff (1981) (Pres.), The Audubon Society Field Guide to North American Mushrooms. New York: Alfred A. Knopf. (See pages 500-525 for the Agaricus and Lepiota Family.)

The Guide clearly states that there is no simple rule for determining the edibility of a mushroom; no rule like “leaflets three, let it be” for Poisonous Oak and Ivy.

The original dataset from the UCI repository has been cleaned up so that missing values are properly labeled as NA and every category carries its full name instead of an abbreviation.
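The cleanup described above can be illustrated with a minimal sketch. It is not the code used to build this dataset; it assumes the raw UCI encoding, in which each value is a single-letter code and a missing stalk root is written as "?". The three sample records, the restriction to three columns, and the names raw and lookup are made up for illustration.

```r
# Three illustrative records in the raw UCI encoding, reduced to three columns
# (class, cap-shape, stalk-root). The real file has 23 comma-separated columns.
raw_lines <- c("p,x,e", "e,b,c", "e,x,?")

raw <- read.csv(
  text = raw_lines,
  header = FALSE,
  col.names = c("edible", "cap-shape", "stalk-root"),
  check.names = FALSE,       # keep the hyphenated column names
  na.strings = "?",          # "?" marks a missing value in the raw file
  colClasses = "character",  # read every column as plain text
  stringsAsFactors = FALSE
)

# Hypothetical lookup tables from single-letter codes to full category names.
lookup <- list(
  edible = c(e = "edible", p = "poisonous"),
  `cap-shape` = c(b = "bell", c = "conical", x = "convex",
                  f = "flat", k = "knobbed", s = "sunken"),
  `stalk-root` = c(b = "bulbous", c = "club", e = "equal", r = "rooted")
)

# Replace each code by its full name; NA entries stay NA.
for (col in names(lookup)) {
  raw[[col]] <- unname(lookup[[col]][raw[[col]]])
}

print(raw)
```

Indexing a named character vector by the code column keeps the translation in one place per column, and missing codes fall through as NA without special handling.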

References

C. Ahlmann-Eltze and C. Yau, "MixDir: Scalable Bayesian Clustering for High-Dimensional Categorical Data", 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), Turin, Italy, 2018, pp. 526-539.

Blake, C.L. & Merz, C.J. (1998). UCI Repository of Machine Learning Databases. Irvine, CA: University of California, Department of Information and Computer Science.

Examples

data("mushroom")
summary(mushroom)
#>    bruises           cap-color          cap-shape         cap-surface
#>  Length:8124        Length:8124        Length:8124        Length:8124
#>  Class :character   Class :character   Class :character   Class :character
#>  Mode  :character   Mode  :character   Mode  :character   Mode  :character
#>     edible          gill-attachment     gill-color         gill-size
#>  Length:8124        Length:8124        Length:8124        Length:8124
#>  Class :character   Class :character   Class :character   Class :character
#>  Mode  :character   Mode  :character   Mode  :character   Mode  :character
#>  gill-spacing         habitat             odor             population
#>  Length:8124        Length:8124        Length:8124        Length:8124
#>  Class :character   Class :character   Class :character   Class :character
#>  Mode  :character   Mode  :character   Mode  :character   Mode  :character
#>  ring-number         ring-type         spore-print-color
#>  Length:8124        Length:8124        Length:8124
#>  Class :character   Class :character   Class :character
#>  Mode  :character   Mode  :character   Mode  :character
#>  stalk-color-above-ring stalk-color-below-ring  stalk-root
#>  Length:8124            Length:8124            Length:8124
#>  Class :character       Class :character       Class :character
#>  Mode  :character       Mode  :character       Mode  :character
#>  stalk-shape        stalk-surface-above-ring stalk-surface-below-ring
#>  Length:8124        Length:8124              Length:8124
#>  Class :character   Class :character         Class :character
#>  Mode  :character   Mode  :character         Mode  :character
#>   veil-color         veil-type
#>  Length:8124        Length:8124
#>  Class :character   Class :character
#>  Mode  :character   Mode  :character
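
Because every column is stored as a character vector, summary() reports only length, class, and mode. One way to get per-category counts is to convert the columns to factors first. The sketch below runs on a small stand-in data frame, toy, which is hypothetical and only borrows two column names and a few category values from the format above; the same calls should apply to mushroom once it is loaded.

```r
# Hypothetical stand-in with the same column style as the mushroom data.
toy <- data.frame(
  edible = c("poisonous", "edible", "edible"),
  `stalk-root` = c("equal", NA, "club"),
  check.names = FALSE,      # keep the hyphenated column name
  stringsAsFactors = FALSE
)

# Convert every column to a factor so summary() counts each category.
toy[] <- lapply(toy, factor)
summary(toy)

# Cross-tabulate one column, keeping NA as its own group if present.
table(toy[["stalk-root"]], useNA = "ifany")

# Count missing values per column.
na_counts <- colSums(is.na(toy))
print(na_counts)
```

Hyphenated column names need either backticks (toy$`stalk-root`) or double-bracket indexing as used here.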