Programación lógica inductiva

La Programación lógica inductiva (ILP, por sus siglas en inglés) es un subcampo de lainteligencia artificial simbólica que usa programación lógica como representación uniforme para ejemplos, hipótesis y conocimiento previo. Dada una codificación del conocimiento previo sabida y un conjunto de ejemplos representados como base de datos lógica de hechos, un sistema de ILP derivará una lógica hipotetizada cuya consecuencia lógica implique todos los ejemplos positivos y ninguno de los negativos.

Esquema: Ejemplos positivos + ejemplos negativos + conocimiento previo ⇒ hipòtesis.

La programación lógica inductiva es particularmente útil en bioinformática y procesamiento del lenguaje natural . Gordon Plotkin y Ehud Shapiro sentaron las bases teóricas iniciales para el aprendizaje automático inductivo en un entorno lógico.^[1]^[2]^[3] Shapiro construyó su primera implementación (Modelo de Sistema de Inferencia) en 1981:^[4] un programa Prolog que dedujo inductivamente programas lógicos a partir de ejemplos positivos y negativos. El término Programación lógica inductiva se introdujo por primera vez^[5] en un artículo de Stephen Muggleton en 1991.^[6] Muggleton también fundó la conferencia internacional anual sobre programación de lógica inductiva, introdujo las ideas teóricas de la Invención de Predicados, la resolución inversa,^[7] y la implicación inversa.^[8] Muggleton implementó la vinculación inversa primero en el sistema PROGOL . El término " inductivo " aquí se refiere a la inducción filosófica (es decir, que sugiere una teoría para explicar los hechos observados) en lugar de matemática (es decir, que demuestra una propiedad para todos los miembros de un conjunto bien ordenado).

Definición formal

Los conocimientos previos se dan como una teoría lógica $B$ , comúnmente en forma de cláusulas Horn utilizadas en la programación lógica . Los ejemplos positivos y negativos se dan como una conjunción $E^{+}$ y $E^{-}$ de ground literals sin negar y negados, respectivamente. Una hipótesis correcta $h$ es una proposición lógica que satisface los siguientes requisitos.^[9]

{\begin{array}{llll}{\text{Necesidad:}}&B&\not \models &E^{+}\\{\text{Suficiencia:}}&B\land h&\color {blue}{\models }&E^{+}\\{\text{Consistencia débil:}}&B\land h&\not \models &{\textit {false}}\\{\text{Consistencia fuerte:}}&B\land h\land E^{-}&\not \models &{\textit {false}}\end{array}}

La " necesidad " no impone una restricción a $h$ , pero prohíbe cualquier generación de hipótesis mientras los hechos positivos sean explicables sin ella.

La "suficiencia " es lo que requiere cualquier hipótesis generada $h$ para explicar todos los ejemplos positivos $E^{+}$ .

La " consistencia débil " prohíbe la generación de cualquier hipótesis $h$ que contradiga el conocimiento básico $B$ .

La "consistencia fuerte " también prohíbe la generación de cualquier hipótesis $h$ que sea inconsistente con los ejemplos negativos $E^{-}$ , dado el conocimiento de fondo $B$ ; esto implica "Consistencia débil "; Si no se dan ejemplos negativos, ambos requisitos coinciden. Džeroski^[10] requiere solo "Suficiencia " (que llama "Completitud") y "Consistencia fuerte ".

Ejemplo

Presuntas relaciones familiares en la sección "Ejemplo"

El siguiente ejemplo bien conocido sobre el aprendizaje de definiciones de relaciones familiares utiliza las siguientes abreviaturas:

par : padre

,

fem : mujer

,

dau : hija

,

g : George

,

h : Helen

,

m : Mary

,

t : Tom

,

n : Nancy

, y

e : Eve

.

Comienza con el sgte. conocimiento previo (cf. imagen):

{\textit {par}}(h,m)\land {\textit {par}}(h,t)\land {\textit {par}}(g,m)\land {\textit {par}}(t,e)\land {\textit {par}}(n,e)\land {\textit {fem}}(h)\land {\textit {fem}}(m)\land {\textit {fem}}(n)\land {\textit {fem}}(e)

,

los ejemplos positivos

{\textit {dau}}(m,h)\land {\textit {dau}}(e,t)

,

y la proposición trivial $true$ para denotar la ausencia de ejemplos negativos.

El enfoque de Plotkin^[11]^[12] de "generalización general menos relativa (rlgg) " a la programación lógica inductiva se utilizará para obtener una sugerencia sobre cómo definir formalmente la relación hija $dau$ .

Este enfoque utiliza los siguientes pasos.

Relativiza cada ejemplo positivo literal con el conocimiento previo (o de fondo) completo:
${\begin{aligned}{\textit {dau}}(m,h)\leftarrow {\textit {par}}(h,m)\land {\textit {par}}(h,t)\land {\textit {par}}(g,m)\land {\textit {par}}(t,e)\land {\textit {par}}(n,e)\land {\textit {fem}}(h)\land {\textit {fem}}(m)\land {\textit {fem}}(n)\land {\textit {fem}}(e)\\{\textit {dau}}(e,t)\leftarrow {\textit {par}}(h,m)\land {\textit {par}}(h,t)\land {\textit {par}}(g,m)\land {\textit {par}}(t,e)\land {\textit {par}}(n,e)\land {\textit {fem}}(h)\land {\textit {fem}}(m)\land {\textit {fem}}(n)\land {\textit {fem}}(e)\end{aligned}}$ ,
Lo convierte en cláusula forma normal :
${\begin{aligned}{\textit {dau}}(m,h)\lor \lnot {\textit {par}}(h,m)\lor \lnot {\textit {par}}(h,t)\lor \lnot {\textit {par}}(g,m)\lor \lnot {\textit {par}}(t,e)\lor \lnot {\textit {par}}(n,e)\lor \lnot {\textit {fem}}(h)\lor \lnot {\textit {fem}}(m)\lor \lnot {\textit {fem}}(n)\lor \lnot {\textit {fem}}(e)\\{\textit {dau}}(e,t)\lor \lnot {\textit {par}}(h,m)\lor \lnot {\textit {par}}(h,t)\lor \lnot {\textit {par}}(g,m)\lor \lnot {\textit {par}}(t,e)\lor \lnot {\textit {par}}(n,e)\lor \lnot {\textit {fem}}(h)\lor \lnot {\textit {fem}}(m)\lor \lnot {\textit {fem}}(n)\lor \lnot {\textit {fem}}(e)\end{aligned}}$ ,
Anti-unifica cada par compatible^[13]^[14] de literales:
- ${\textit {dau}}(x_{me},x_{ht})$ de ${\textit {dau}}(m,h)$ y ${\textit {dau}}(e,t)$ ,
- $\lnot {\textit {par}}(x_{ht},x_{me})$ de $\lnot {\textit {par}}(h,m)$ y $\lnot {\textit {par}}(t,e)$ ,
- $\lnot {\textit {fem}}(x_{me})$ de $\lnot {\textit {fem}}(m)$ y $\lnot {\textit {fem}}(e)$ ,
- $\lnot {\textit {par}}(g,m)$ de $\lnot {\textit {par}}(g,m)$ y $\lnot {\textit {par}}(g,m)$ , similar para todos los demás literales de conocimiento de fondo
- $\lnot {\textit {par}}(x_{gt},x_{me})$ de $\lnot {\textit {par}}(g,m)$ y $\lnot {\textit {par}}(t,e)$ y muchos más literales negados
Elimina todos los literales negados que contienen variables que no aparecen en un literal positivo:
- después de eliminar todos los literales negados que contienen otras variables que $x_{me},x_{ht}$ , solamente ${\textit {dau}}(x_{me},x_{ht})\lor \lnot {\textit {par}}(x_{ht},x_{me})\lor \lnot {\textit {fem}}(x_{me})$ permanece, junto con todos los literales básicos del conocimiento de fondo
Convierta las cláusulas de nuevo a la forma Horn:
- ${\textit {dau}}(x_{me},x_{ht})\leftarrow {\textit {par}}(x_{ht},x_{me})\land {\textit {fem}}(x_{me})\land ({\text{todos los hechos del conocimiento previo}})$

La cláusula Horn resultante es la hipótesis $h$ obtenida por el enfoque rlgg. Ignorando los hechos del conocimiento de fondo, la cláusula lee informalmente " $x_{me}$ se llama hija de $x_{ht}$ Si $x_{ht}$ es el padre de $x_{me}$ y $x_{me}$ es femenino ", que es una definición comúnmente aceptada.

Con respecto a los requisitos anteriores, La "Necesidad " se cumplió porque el predicado $dau$ no aparece en el conocimiento de fondo. La "suficiencia " se satisface con la hipótesis calculada $h$ , ya que, junto con ${\textit {par}}(h,m)\land {\textit {fem}}(m)$ desde el conocimiento previo, implica el primer ejemplo positivo ${\textit {dau}}(m,h)$ y de manera similar $h$ y ${\textit {par}}(t,e)\land {\textit {fem}}(e)$ desde el conocimiento previo implica el segundo ejemplo positivo ${\textit {dau}}(e,t)$ . La "Consistencia débil " es satisfecha por $h$ , ya que $h$ se mantiene en la estructura Herbrand (finita) descrita por el conocimiento de fondo; lo mismo pasa con la "Consistencia fuerte".

La definición común de la relación de la abuela, a saber. ${\textit {gra}}(x,z)\leftarrow {\textit {fem}}(x)\land {\textit {par}}(x,y)\land {\textit {par}}(y,z)$ , no se puede aprender utilizando el enfoque anterior, ya que la variable $y$ ocurre solo en el cuerpo de la cláusula; los literales correspondientes se habrían eliminado en el cuarto paso del enfoque. Para superar este defecto, ese paso tiene que modificarse de modo que pueda parametrizarse con diferentes heurísticas de post-selección literales . Históricamente, la implementación de GOLEM se basa en el enfoque rlgg.

Sistemas de programación de lógica inductiva

Un sistema de programación lógica inductiva es un programa que toma como entrada teorías lógicas de entrada $B,E^{+},E^{-}$ y genera una hipótesis correcta $H$ . Un algoritmo de un sistema ILP consta de dos partes: búsqueda de hipótesis y selección de hipótesis. Primero se busca una hipótesis con un procedimiento de programación de lógica inductiva, luego se elige un subconjunto de las hipótesis encontradas (en la mayoría de los sistemas, una hipótesis) mediante un algoritmo de selección. Un algoritmo de selección puntúa cada una de las hipótesis encontradas y devuelve las que tienen la puntuación más alta. Un ejemplo de función de puntaje puede ser una que incluya longitud de compresión mínima donde una hipótesis con la menor complejidad de Kolmogorov tiene el puntaje más alto y por lo tanto se la elige. Un sistema ILP está completo si y solo si para cualquier teoría lógica de entrada $B,E^{+},E^{-}$ cualquier hipótesis correcta $H$ se puede encontrar con su procedimiento de búsqueda de hipótesis.

Búsqueda de hipótesis

Los sistemas ILP modernos como Progol,^[6] Hail^[15] e Imparo^[16] encuentran una hipótesis $H$ utilizando el principio de la vinculación inversa para las teorías $B$ , $E$ , $H$ : $B\land H\models E\iff B\land \neg E\models \neg H$ . Primero construyen una teoría intermedia $F$ llamada teoría de puente que satisface las condiciones $B\land \neg E\models F$ y $F\models \neg H$ . Entonces como $H\models \neg F$ , generalizan la negación de la teoría del puente $F$ con la anti-vinculación.^[17] Sin embargo, la operación del anti-vinculación, dado que es altamente no determinista es computacionalmente más costosa. Por lo tanto, se puede realizar una búsqueda alternativa de hipótesis utilizando la operación de la subsunción inversa (anti-subsunción), que es menos no determinista que la anti-vinculación.

Surgen preguntas sobre la integridad de un procedimiento de búsqueda de hipótesis de un sistema ILP específico. Por ejemplo, el procedimiento de búsqueda de hipótesis de Progol basado en la regla de inferencia de vinculación inversa no se completa con el ejemplo de Yamamoto .^[18] Por otro lado, Imparo se completa por el procedimiento anti-vinculación^[19] y su procedimiento de subsunción inversa extendida.^[20]

Implementaciones

1BC y 1BC2: clasificadores bayesianos "ingenuos" de primer orden:
ACE (un motor combinado)
Aleph
Atom Archivado el 26 de marzo de 2014 en Wayback Machine.
Claudien (enlace roto disponible en Internet Archive; véase el historial, la primera versión y la última).
DL-Learner
DMax
FastLAS (Aprendizaje rápido desde conjuntos de respuestas)
[1] (enlace roto disponible en Internet Archive; véase el historial, la primera versión y la última). FOIL
[2] Golem (ILP)
ILASP (Aprendizaje inductivo de programas desde conjunto de respuestas)
Imparo^[19]
Inthelex (INcremental THEory Learning from EXamples) Archivado el 28 de noviembre de 2011 en Wayback Machine.
Lima
Metagol
Mio
MIS (Sistema de inferencia modelo) de Ehud Shapiro
[3] PROGOL
RSD Archivado el 1 de marzo de 2007 en Wayback Machine.
Warmr (ahora incluido en ACE)
ProGolem^[21]^[22]

Véase también

Referencias

↑ Plotkin G.D. Automatic Methods of Inductive Inference, PhD thesis, University of Edinburgh, 1970.
↑ Shapiro, Ehud Y. Inductive inference of theories from facts, Research Report 192, Yale University, Department of Computer Science, 1981. Reprinted in J.-L. Lassez, G. Plotkin (Eds.), Computational Logic, The MIT Press, Cambridge, MA, 1991, pp. 199–254.
↑ Shapiro, Ehud Y. (1983). Algorithmic program debugging. Cambridge, Mass: MIT Press. ISBN 0-262-19218-7
↑ Shapiro, Ehud Y. "The model inference system." Proceedings of the 7th international joint conference on Artificial intelligence-Volume 2. Morgan Kaufmann Publishers Inc., 1981.
↑ Luc De Raedt. A Perspective on Inductive Logic Programming. The Workshop on Current and Future Trends in Logic Programming, Shakertown, to appear in Springer LNCS, 1999. CiteSeerX: 10.1.1.56.1790
↑ ^a ^b Muggleton, S.H. (1991). «Inductive logic programming». New Generation Computing 8 (4): 295-318. doi:10.1007/BF03037089.
↑ Muggleton S.H. and Buntine W. "Machine invention of first-order predicate by inverting resolution","Proceedings of the 5th International Conference on Machine Learning, 1988.
↑ Muggleton, S.H. (1995). «Inverting entailment and Progol». New Generation Computing 13 (3–4): 245-286. doi:10.1007/bf03037227.
↑ Muggleton, Stephen (1999). «Inductive Logic Programming: Issues, Results and the Challenge of Learning Language in Logic». Artificial Intelligence 114 (1–2): 283-296. doi:10.1016/s0004-3702(99)00067-3. ; here: Sect.2.1
↑ Džeroski, Sašo (1996), «Inductive Logic Programming and Knowledge Discovery in Databases», en Fayyad, U.M.; Piatetsky-Shapiro, G.; Smith et al., eds., Advances in Knowledge Discovery and Data Mining, MIT Press, pp. 117-152 .; here: Sect.5.2.4
↑ Plotkin, Gordon D. (1970). «A Note on Inductive Generalization». En Meltzer, B., ed. Machine Intelligence 5: 153-163.
↑ Plotkin, Gordon D. (1971). «A Further Note on Inductive Generalization». En Meltzer, B., ed. Machine Intelligence 6: 101-124.
↑ i.e. sharing the same predicate symbol and negated/unnegated status
↑ in general: $n$ -tuple when $n$ positive example literals are given
↑ Ray, O., Broda, K., & Russo, A. M. (2003). Hybrid abductive inductive learning Archivado el 8 de abril de 2022 en Wayback Machine.. In LNCS: Vol. 2835. Proceedings of the 13th international conference on inductive logic programming (pp. 311–328). Berlin: Springer.
↑ Kimber, T., Broda, K., & Russo, A. (2009). Induction on failure: learning connected Horn theories. In LNCS: Vol. 5753. Proceedings of the 10th international conference on logic programing and nonmonotonic reasoning (pp. 169–181). Berlin: Springer.
↑ Yoshitaka Yamamoto, Katsumi Inoue, and Koji Iwanuma. Inverse subsumption for complete explanatory induction. Machine learning, 86(1):115–139, 2012.
↑ Akihiro Yamamoto. Which hypotheses can be found with inverse entailment? In Inductive Logic Programming, pages 296–308. Springer, 1997.
↑ ^a ^b Timothy Kimber. Learning definite and normal logic programs by induction on failure. PhD thesis, Imperial College London, 2012.
↑ David Toth (2014). Imparo is complete by inverse subsumption. arXiv:1407.3836
↑ Muggleton, Stephen; Santos, Jose; Tamaddoni-Nezhad, Alireza (2009). «ProGolem: a system based on relative minimal generalization». ILP.
↑ Santos, Jose; Nassif, Houssam; Page, David; Muggleton, Stephen; Sternberg, Mike (2012). «Automated identification of features of protein-ligand interactions using Inductive Logic Programming: a hexose binding case study». BMC Bioinformatics 13: 162. PMC 3458898. PMID 22783946. doi:10.1186/1471-2105-13-162. Archivado desde el original el 3 de marzo de 2016. Consultado el 10 de julio de 2020.

Otras lecturas

Muggleton, S.; De Raedt, L. (1994). «Inductive Logic Programming: Theory and methods». The Journal of Logic Programming. 19–20: 629-679. doi:10.1016/0743-1066(94)90035-3.
Lavrac, N.; Dzeroski, S. (1994). Inductive Logic Programming: Techniques and Applications. New York: Ellis Horwood. ISBN 978-0-13-457870-5. Archivado desde el original el 6 de septiembre de 2004. Consultado el 22 de septiembre de 2004.
Visual example of inducing the grandparenthood relation by the Atom system. http://john-ahlgren.blogspot.com/2014/03/inductive-reasoning-visualized.html Archivado el 26 de marzo de 2014 en Wayback Machine.

Datos: Q1464197

[1] Plotkin G.D. Automatic Methods of Inductive Inference, PhD thesis, University of Edinburgh, 1970.

[2] Shapiro, Ehud Y. Inductive inference of theories from facts, Research Report 192, Yale University, Department of Computer Science, 1981. Reprinted in J.-L. Lassez, G. Plotkin (Eds.), Computational Logic, The MIT Press, Cambridge, MA, 1991, pp. 199–254.

[3] Shapiro, Ehud Y. (1983). Algorithmic program debugging. Cambridge, Mass: MIT Press. ISBN 0-262-19218-7

[4] Shapiro, Ehud Y. "The model inference system." Proceedings of the 7th international joint conference on Artificial intelligence-Volume 2. Morgan Kaufmann Publishers Inc., 1981.

[5] Luc De Raedt. A Perspective on Inductive Logic Programming. The Workshop on Current and Future Trends in Logic Programming, Shakertown, to appear in Springer LNCS, 1999. CiteSeerX: 10.1.1.56.1790

[muggleton1995inverse-6] Muggleton, S.H. (1991). «Inductive logic programming». New Generation Computing 8 (4): 295-318. doi:10.1007/BF03037089.

[7] Muggleton S.H. and Buntine W. "Machine invention of first-order predicate by inverting resolution","Proceedings of the 5th International Conference on Machine Learning, 1988.

[8] Muggleton, S.H. (1995). «Inverting entailment and Progol». New Generation Computing 13 (3–4): 245-286. doi:10.1007/bf03037227.

[9] Muggleton, Stephen (1999). «Inductive Logic Programming: Issues, Results and the Challenge of Learning Language in Logic». Artificial Intelligence 114 (1–2): 283-296. doi:10.1016/s0004-3702(99)00067-3. ; here: Sect.2.1

[10] Džeroski, Sašo (1996), «Inductive Logic Programming and Knowledge Discovery in Databases», en Fayyad, U.M.; Piatetsky-Shapiro, G.; Smith et al., eds., Advances in Knowledge Discovery and Data Mining, MIT Press, pp. 117-152 .; here: Sect.5.2.4

[11] Plotkin, Gordon D. (1970). «A Note on Inductive Generalization». En Meltzer, B., ed. Machine Intelligence 5: 153-163.

[12] Plotkin, Gordon D. (1971). «A Further Note on Inductive Generalization». En Meltzer, B., ed. Machine Intelligence 6: 101-124.

[13] .e. sharing the same predicate symbol and negated/unnegated status

[14] ral: $n$ -tuple when $n$ positive example literals are given

[15] Ray, O., Broda, K., & Russo, A. M. (2003). Hybrid abductive inductive learning Archivado el 8 de abril de 2022 en Wayback Machine.. In LNCS: Vol. 2835. Proceedings of the 13th international conference on inductive logic programming (pp. 311–328). Berlin: Springer.

[16] Kimber, T., Broda, K., & Russo, A. (2009). Induction on failure: learning connected Horn theories. In LNCS: Vol. 5753. Proceedings of the 10th international conference on logic programing and nonmonotonic reasoning (pp. 169–181). Berlin: Springer.

[17] Yoshitaka Yamamoto, Katsumi Inoue, and Koji Iwanuma. Inverse subsumption for complete explanatory induction. Machine learning, 86(1):115–139, 2012.

[18] Akihiro Yamamoto. Which hypotheses can be found with inverse entailment? In Inductive Logic Programming, pages 296–308. Springer, 1997.

[kimber2009induction-19] Timothy Kimber. Learning definite and normal logic programs by induction on failure. PhD thesis, Imperial College London, 2012.

[20] David Toth (2014). Imparo is complete by inverse subsumption. arXiv:1407.3836

[21] Muggleton, Stephen; Santos, Jose; Tamaddoni-Nezhad, Alireza (2009). «ProGolem: a system based on relative minimal generalization». ILP.

[22] Santos, Jose; Nassif, Houssam; Page, David; Muggleton, Stephen; Sternberg, Mike (2012). «Automated identification of features of protein-ligand interactions using Inductive Logic Programming: a hexose binding case study». BMC Bioinformatics 13: 162. PMC 3458898. PMID 22783946. doi:10.1186/1471-2105-13-162. Archivado desde el original el 3 de marzo de 2016. Consultado el 10 de julio de 2020.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]

[16]

[17]

[18]

[19]

[20]

[21]

[22]