绍尔-谢拉赫引理

组合数学和极值集合论（英语：extremal set theory）中，绍尔-谢拉赫引理（英语：Sauer–Shelah lemma）断言，若集合族的VC维低，则该族不能有太多个集合。引理得名于诺贝特·绍尔^[1]和萨哈龙·谢拉赫（英语：Saharon Shelah）^[2]，两人分别独立于1972年发表此结果。^{[注 1]}较之略早，在1971年，弗拉基米尔·瓦普尼克和亚历克塞·泽范兰杰斯合著的论文^[6]已有此结果（“VC维”即以两人为名）。谢拉赫发表引理时，亦归功于米哈·佩尔莱斯（英语：Micha Perles），故引理又称为佩尔莱斯-绍尔-谢拉赫引理。^[7]

布萨格洛等人称其为“关于VC维的最根本结论之一”^[7]。引理在若干方面有应用，就各发现者的研究动机而言，绍尔是研究集族的组合学，谢拉赫研究模型论，VC二氏则研究统计学。后来，引理亦用于离散几何^[8]和图论^[9]。

定义及叙述

设 $\textstyle {\mathcal {F}}=\{S_{1},S_{2},\dots \}$ 为一族集合， $T$ 为另一集，此时所谓 $T$ 为 ${\mathcal {F}}$ 的碎裂集（英语：shattered set），意思是 $T$ 的每个子集（包括空集和 $T$ 本身），皆可表示成 $T$ 与该族某集之交 $T\cap S_{i}$ 。 ${\mathcal {F}}$ 的VC维是其打碎的最大集的大小。

利用以上术语，绍尔-谢拉赫引理可以写成：

若 ${\mathcal {F}}$ 是一族集合，且各集合中，合共只有 $n$ 个不同元素，但 $\textstyle |{\mathcal {F}}|>\sum _{i=0}^{k-1}{\binom {n}{i}}$ ，则 ${\mathcal {F}}$ 打碎某个 $k$ 元集合。

所以，若 ${\mathcal {F}}$ 的VC维为 $k$ （故不打碎任何 $k+1$ 元集），则 ${\mathcal {F}}$ 中至多只有 $\textstyle \sum _{i=0}^{k}{\binom {n}{i}}=O(n^{k})$ 个集合。

引理所给的界已是最优：考虑 $n$ 元集 $\{1,2,\dots ,n\}$ 中，所有小于 $k$ 个元素的子集，所成的族 ${\mathcal {F}}$ 。该族的大小恰为 $\textstyle \sum _{i=0}^{k-1}{\binom {n}{i}}$ ，但是不打碎任何 $k$ 元集。^[10]

打碎多少集合

帕约尔将绍尔-谢拉赫引理加强为：^[12]

有限族 ${\mathcal {F}}$ 必打碎至少 $|{\mathcal {F}}|$ 个集合。

绍尔-谢拉赫引理为其直接推论，因为 $n$ 元全集的子集中，只有 $\textstyle \sum _{i=0}^{k-1}{\binom {n}{i}}$ 个的元素个数小于 $k$ 。故 $\textstyle |{\mathcal {F}}|>\sum _{i=0}^{k-1}{\binom {n}{i}}$ 时，即使全部小集皆已打碎，仍不足 $|{\mathcal {F}}|$ 个，故必有打碎某个不少于 $k$ 元的集合。

亦可将“碎裂”的概念加以限缩，变成“顺序碎裂”（order-shattered）。此时，任意集族顺序打碎的集合数，必恰好等于该族的大小。^[11]

证

绍尔-谢拉赫引理的帕约尔变式可使用数学归纳法证明，此证法一说出自诺加·阿隆^[13]，一说出自朗·阿哈罗尼（英语：Ron Aharoni）及朗·霍尔兹曼（Ron Holzman）^[11]。

作为归纳法的起始步骤，单一个集合组成的族 ${\mathcal {F}}$ ，能打碎空集，已足 $|{\mathcal {F}}|$ 个。

至于递推步骤，可假设引理对任意少于 $|{\mathcal {F}}|$ 个集合组成的族皆成立，且族 ${\mathcal {F}}$ 有至少两个集合，故可选取元素 $x$ 为其一的元素，但不属于另一集。如此便可将 ${\mathcal {F}}$ 的集合分成两类：包含 $x$ 的集合归入子族 ${\mathcal {F}}_{1}$ ，不含 $x$ 的则归入 ${\mathcal {F}}_{0}$ 。

由归纳假设，两子族各自打碎的集合数，其和至少为 $|{\mathcal {F}}_{0}|+|{\mathcal {F}}_{1}|=|{\mathcal {F}}|$ 。

该些碎裂集不能有元素 $x$ ，因为若要打碎某个有 $x$ 的集合 $A$ ，则族中需要有含 $x$ 的集合，也需要有不含 $x$ 的集合，才能使该族各集与 $A$ 的交集出齐 $A$ 的全体子集。

若集 $S$ 只被一个子族打碎，则数算两子族打碎的集合时， $S$ 计为一个，数 ${\mathcal {F}}$ 打碎的集合时亦计为一个。但 $S$ 也可能同时被两子族打碎，如此，数算两子族打碎的集合时，会将 $S$ 计两次；不过 ${\mathcal {F}}$ 既打碎 $S$ ，也打碎 $S\cup \{x\}$ ，所以 $S$ （和 $S\cup \{x\}$ ）在数 ${\mathcal {F}}$ 打碎的集合时，也计为两次。由此， ${\mathcal {F}}$ 打碎的集合数，不小于两子族各自打碎集合数之和，从而不小于 $|{\mathcal {F}}|$ 。

绍尔-谢拉赫引理的原始形式还有另一种证明，利用线性代数及容斥原理，该证法由弗龙克尔·彼得（英语：Péter Frankl）和保奇·亚诺什（英语：János Pach）给出。^[8]^[10]

应用

VC二氏原先将引理应用于证明，若考虑一族VC维固定的事件，则只需有限多个取样点，即可近似任意的概率分布（使取样所得的事件概率近似该分布下的事件概率），且取样点的个数只取决于VC维数。具体而言，有两种意义下的近似，各由参数 $\varepsilon$ 定义：

取样集 $S$ 上的概率分布定义为原分布的“ $\varepsilon$ 近似”（英语： $\varepsilon$ -approximation），意思是每个事件被 $S$ 以该分布取样的概率，与原概率所差不过 $\varepsilon$ 。
取样集 $S$ （不设权重）定义为“ε网（英语：ε-net (computational geometry)）”，意思是概率不小于 $\varepsilon$ 的任一事件中，必含 $S$ 中至少一点。

由定义， $\varepsilon$ 近似必为 $\varepsilon$ 网，反之则不一定。

VC二氏以此引理证明，若某集族的VC维为 $d$ ，则必有大小不过 $O({\tfrac {d}{\varepsilon ^{2}}}\log {\tfrac {d}{\varepsilon }})$ 的 $\varepsilon$ 近似。其后，Haussler & Welzl (1987)^[14]和Komlós, Pach & Woeginger (1992)^[15]等发表了类似的结果，证明必存在大小为 $O({\tfrac {d}{\varepsilon }}\log {\tfrac {1}{\varepsilon }})$ 的 $\varepsilon$ 网，具体上界为 ${\tfrac {d}{\varepsilon }}\ln {\tfrac {1}{\varepsilon }}+{\tfrac {2d}{\varepsilon }}\ln \ln {\tfrac {1}{\varepsilon }}+{\tfrac {6d}{\varepsilon }}$ 。^[8]证明存在小 $\varepsilon$ 网的主要方法是，先拣选 $O({\tfrac {d}{\varepsilon }}\log {\tfrac {1}{\varepsilon }})$ 个点的随机样本 $x$ ，再独立拣选另 $O({\tfrac {d}{\varepsilon }}\log ^{2}{\tfrac {1}{\varepsilon }})$ 个点的随机样本 $y$ ，然后证明以下的不等式：存在某个较大事件 $E$ 与 $x$ 不交的概率 $p_{1}$ ，与存在同样的事件，且该事件与 $y$ 相交点数大于中位值的概率 $p_{2}$ ，两概率之比至多为 $2$ 。若固定 $E$ 和 $x\cup y$ ，则 $x$ 不交 $E$ 而 $y$ 与 $E$ 相交点数多于中位值的概率很小。由绍尔-谢拉赫引理， $E$ 与 $x\cup y$ 的交集的可能情况并不太多，计算“存在 $E$ 满足条件”的概率时，只需对每种交集的可能性考虑一个 $E$ ，因此由并集上界，可得 $p_{1}\leq 2p_{2}<1$ ，故 $x$ 有非零概率为 $\varepsilon$ 网。^[8]

以上论证说明随机取样足够多个点就可能是 $\varepsilon$ 网。此性质以及 $\varepsilon$ 网、 $\varepsilon$ 近似两概念本身，皆在机器学习和概率近似正确学习（英语：probably approximately correct learning）方面有应用。^[16]计算几何方面的应用则有范围搜索（英语：range searching）^[14]、去随机化（英语：derandomization）^[17]、近似算法^[18]^[19]。

Kozma & Moran (2013)利用绍尔-谢拉赫引理的推广，证明若干图论结果，例如：图的强定向（英语：strong orientation）^{[注 2]}方案数介于其连通子图数与边双连通（英语：k-edge-connected graph）子图数之间。^[9]

注

^ 多个来源断言此处（以及与下文VC二氏的发现）的独立性^[3]，如^[4]^[5]。
^ 为每条边选定一方向，使全幅图强连通。

参考文献

^ Sauer, N. On the density of families of sets. Journal of Combinatorial Theory（英语：Journal of Combinatorial Theory）. Series A. 1972, 13: 145–147. MR 0307902. doi:10.1016/0097-3165(72)90019-2  （英语）.
^ Shelah, Saharon. A combinatorial problem; stability and order for models and theories in infinitary languages. Pacific Journal of Mathematics. 1972, 41: 247–261. MR 0307903. doi:10.2140/pjm.1972.41.247 . （原始内容存档于2013-10-05）（英语）.
^ Bottou, Leon. On the Vapnik-Chevonenkis-Sauer lemma. [2022-03-07]. （原始内容存档于2022-03-19）（英语）.
^ Bollobás, Béla; Radcliff, A.J. Defect Sauer results. Journal of Combinatorial Theory, Series A. 1995, 72: 189–208. doi:10.1016/0097-3165(95)90060-8 （英语）.
^ Smola, Alexander J.; Mendelson, Shahar. Advanced Lectures on Machine Learning. Springer. 2003: 12. ISBN 9783540364344 （英语）.
^ Vapnik, V. N.; Červonenkis, A. Ja. The uniform convergence of frequencies of the appearance of events to their probabilities. Akademija Nauk SSSR. 1971, 16: 264–279. MR 0288823 （英语）.
^ ^7.0 ^7.1 Buzaglo, Sarit; Pinchasi, Rom; Rote, Günter. Topological hypergraphs. Pach, János (编). Thirty Essays on Geometric Graph Theory. Springer. 2013: 71–81. doi:10.1007/978-1-4614-0110-0_6  （英语）.
^ ^8.0 ^8.1 ^8.2 ^8.3 Pach, János; Agarwal, Pankaj K. Combinatorial geometry. Wiley-Interscience Series in Discrete Mathematics and Optimization. New York: John Wiley & Sons Inc.: 247–251. 1995. ISBN 0-471-58890-3. MR 1354145. doi:10.1002/9781118033203  （英语）.
^ ^9.0 ^9.1 Kozma, László; Moran, Shay. Shattering, Graph Orientations, and Connectivity. Electronic Journal of Combinatorics（英语：Electronic Journal of Combinatorics）. 2013, 20 (3). P44 [2022-03-06]. Bibcode:2012arXiv1211.1319K. MR 3118952. arXiv:1211.1319 . （原始内容存档于2022-05-25）（英语）.
^ ^10.0 ^10.1 Gowers, Timothy. Dimension arguments in combinatorics. Gowers's Weblog: Mathematics related discussions. Example 3. 2008-07-31 [2022-03-12]. （原始内容存档于2022-07-09）（英语）.
^ ^11.0 ^11.1 ^11.2 Anstee, R. P.; Rónyai, Lajos; Sali, Attila. Shattering news. Graphs and Combinatorics（英语：Graphs and Combinatorics）. 2002, 18 (1): 59–73. MR 1892434. doi:10.1007/s003730200003  （英语）.
^ Pajor, Alain. Sous-espaces $l_{1}^{n}$ des espaces de Banach. Travaux en Cours 16. Paris: Hermann. 1985. ISBN 2-7056-6021-6. MR 0903247 （法语）. 由^[11]引述。
^ Kalai, Gil. Extremal Combinatorics III: Some Basic Theorems. Combinatorics and More. 2008-09-28 [2022-03-13]. （原始内容存档于2022-03-13）（英语）.
^ ^14.0 ^14.1 Haussler, David; Welzl, Emo. ε-nets and simplex range queries. Discrete and Computational Geometry（英语：Discrete and Computational Geometry）. 1987, 2 (2): 127–151. MR 0884223. doi:10.1007/BF02187876 .
^ Komlós, János; Pach, János; Woeginger, Gerhard. Almost tight bounds for ε-nets. Discrete and Computational Geometry（英语：Discrete and Computational Geometry）. 1992, 7 (2): 163–173. MR 1139078. doi:10.1007/BF02187833 . .
^ Blumer, Anselm; Ehrenfeucht, Andrzej; Haussler, David; Warmuth, Manfred K. Learnability and the Vapnik–Chervonenkis dimension. Journal of the ACM. 1989, 36 (4): 929–965. MR 1072253. doi:10.1145/76359.76371.
^ Chazelle, B.; Friedman, J. A deterministic view of random sampling and its use in geometry. Combinatorica（英语：Combinatorica）. 1990, 10 (3): 229–249. MR 1092541. doi:10.1007/BF02122778 .
^ Brönnimann, H.; Goodrich, M. T. Almost optimal set covers in finite VC-dimension. Discrete and Computational Geometry（英语：Discrete and Computational Geometry）. 1995, 14 (4): 463–479. MR 1360948. doi:10.1007/BF02570718 .
^ Har-Peled, Sariel. On complexity, sampling, and ε-nets and ε-samples. Geometric approximation algorithms. Mathematical Surveys and Monographs 173. Providence, RI: American Mathematical Society. 2011: 61–85. ISBN 978-0-8218-4911-8. MR 2760023.

[6] 多个来源断言此处（以及与下文VC二氏的发现）的独立性^[3]，如^[4]^[5]。

[21] 为每条边选定一方向，使全幅图强连通。

[1] Sauer, N. On the density of families of sets. Journal of Combinatorial Theory（英语：Journal of Combinatorial Theory）. Series A. 1972, 13: 145–147. MR 0307902. doi:10.1016/0097-3165(72)90019-2  （英语）.

[2] Shelah, Saharon. A combinatorial problem; stability and order for models and theories in infinitary languages. Pacific Journal of Mathematics. 1972, 41: 247–261. MR 0307903. doi:10.2140/pjm.1972.41.247 . （原始内容存档于2013-10-05）（英语）.

[3] Bottou, Leon. On the Vapnik-Chevonenkis-Sauer lemma. [2022-03-07]. （原始内容存档于2022-03-19）（英语）.

[4] Bollobás, Béla; Radcliff, A.J. Defect Sauer results. Journal of Combinatorial Theory, Series A. 1995, 72: 189–208. doi:10.1016/0097-3165(95)90060-8 （英语）.

[5] Smola, Alexander J.; Mendelson, Shahar. Advanced Lectures on Machine Learning. Springer. 2003: 12. ISBN 9783540364344 （英语）.

[7] Vapnik, V. N.; Červonenkis, A. Ja. The uniform convergence of frequencies of the appearance of events to their probabilities. Akademija Nauk SSSR. 1971, 16: 264–279. MR 0288823 （英语）.

[brr-8] 7.0 ^7.1 Buzaglo, Sarit; Pinchasi, Rom; Rote, Günter. Topological hypergraphs. Pach, János (编). Thirty Essays on Geometric Graph Theory. Springer. 2013: 71–81. doi:10.1007/978-1-4614-0110-0_6  （英语）.

[pa95-9] 8.0 ^8.1 ^8.2 ^8.3 Pach, János; Agarwal, Pankaj K. Combinatorial geometry. Wiley-Interscience Series in Discrete Mathematics and Optimization. New York: John Wiley & Sons Inc.: 247–251. 1995. ISBN 0-471-58890-3. MR 1354145. doi:10.1002/9781118033203  （英语）.

[km13-10] 9.0 ^9.1 Kozma, László; Moran, Shay. Shattering, Graph Orientations, and Connectivity. Electronic Journal of Combinatorics（英语：Electronic Journal of Combinatorics）. 2013, 20 (3). P44 [2022-03-06]. Bibcode:2012arXiv1211.1319K. MR 3118952. arXiv:1211.1319 . （原始内容存档于2022-05-25）（英语）.

[gowers-11] 10.0 ^10.1 Gowers, Timothy. Dimension arguments in combinatorics. Gowers's Weblog: Mathematics related discussions. Example 3. 2008-07-31 [2022-03-12]. （原始内容存档于2022-07-09）（英语）.

[ars02-12] 11.0 ^11.1 ^11.2 Anstee, R. P.; Rónyai, Lajos; Sali, Attila. Shattering news. Graphs and Combinatorics（英语：Graphs and Combinatorics）. 2002, 18 (1): 59–73. MR 1892434. doi:10.1007/s003730200003  （英语）.

[13] Pajor, Alain. Sous-espaces $l_{1}^{n}$ des espaces de Banach. Travaux en Cours 16. Paris: Hermann. 1985. ISBN 2-7056-6021-6. MR 0903247 （法语）. 由^[11]引述。

[14] Kalai, Gil. Extremal Combinatorics III: Some Basic Theorems. Combinatorics and More. 2008-09-28 [2022-03-13]. （原始内容存档于2022-03-13）（英语）.

[hw87-15] 14.0 ^14.1 Haussler, David; Welzl, Emo. ε-nets and simplex range queries. Discrete and Computational Geometry（英语：Discrete and Computational Geometry）. 1987, 2 (2): 127–151. MR 0884223. doi:10.1007/BF02187876 .

[16] Komlós, János; Pach, János; Woeginger, Gerhard. Almost tight bounds for ε-nets. Discrete and Computational Geometry（英语：Discrete and Computational Geometry）. 1992, 7 (2): 163–173. MR 1139078. doi:10.1007/BF02187833 . .

[17] Blumer, Anselm; Ehrenfeucht, Andrzej; Haussler, David; Warmuth, Manfred K. Learnability and the Vapnik–Chervonenkis dimension. Journal of the ACM. 1989, 36 (4): 929–965. MR 1072253. doi:10.1145/76359.76371.

[18] Chazelle, B.; Friedman, J. A deterministic view of random sampling and its use in geometry. Combinatorica（英语：Combinatorica）. 1990, 10 (3): 229–249. MR 1092541. doi:10.1007/BF02122778 .

[19] Brönnimann, H.; Goodrich, M. T. Almost optimal set covers in finite VC-dimension. Discrete and Computational Geometry（英语：Discrete and Computational Geometry）. 1995, 14 (4): 463–479. MR 1360948. doi:10.1007/BF02570718 .

[20] Har-Peled, Sariel. On complexity, sampling, and ε-nets and ε-samples. Geometric approximation algorithms. Mathematical Surveys and Monographs 173. Providence, RI: American Mathematical Society. 2011: 61–85. ISBN 978-0-8218-4911-8. MR 2760023.

[1]

[2]

[注 1]

[6]

[7]

[8]

[9]

[10]

[12]

[11]

[13]

[14]

[15]

[16]

[17]

[18]

[19]

[注 2]

[3]

[4]

[5]