Progic 2023

Program and Abstracts

All talks take place in Drift 25, Room 002 (access through Drift 27) in the Utrecht city center.

Wednesday, August 30

09:10 – 09:30

The principles of optimum entropy (maximum entropy and minimum cross-entropy) are powerful and well-known methods in probabilistics and statistics, yet seem to work as black boxes. Over the years, both principles have been characterized by axioms that reveal their crucial properties. From an abstract knowledge representation perspective, maximum entropy provides solutions for inductive reasoning problems in probabilistics, while minimum cross-entropy solves a complex belief revision problem. Although probabilistic reasoning is an important subarea of knowledge representation, the major part of knowledge representation is concerned with symbolic, logic-based approaches that are perceived to be substantially different from probabilistics.

In this talk, I will explore the relevance of some of the axiomatic characteristics of the entropy principles for knowledge representation in a semantic environment that serves as a mediator between probabilities and symbolic, qualitative methodologies: Spohn’s ranking functions, aka ordinal conditional functions, can be interpreted as orders of magnitude of logarithmic probabilities. They share crucial properties with probabilities, in particular, they allow for conditionalization, but work on a discrete, semi-quantitative scale and, in the end, yield symbolic inferences via qualitative comparisons. Since ranking functions comply with all requirements of axiomatic non-monotonic reasoning and belief revision in knowledge representation, re-interpreting axiomatic properties of the entropy principles in this semantic framework is relevant for both subfields. In particular, I show which role the characteristic properties of conditional preservation, syntax splitting, and kinematics satisfied by the entropy principles can and should play for (symbolic) nonmonotonic reasoning and belief revision. The talk presents results from recent joint works with Christoph Beierle, Gerhard Brewka, and Meliha Sezgin.
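As a rough illustration of the conditionalization and qualitative comparison that ranking functions allow, the following Python sketch works on a tiny set of possible worlds. The worlds, the rank values, and the helper names (`rank`, `cond_rank`, `accepts`) are hypothetical choices for this example and are not taken from the talk.

```python
# Worlds are truth assignments (bird, flies); ranks are degrees of disbelief.
# Hypothetical example values; rank 0 means "not disbelieved at all".
kappa = {
    (True, True): 0,
    (True, False): 1,
    (False, True): 2,
    (False, False): 0,
}

def rank(prop):
    """Rank of a proposition: the minimum rank among the worlds satisfying it."""
    ranks = [r for world, r in kappa.items() if prop(world)]
    return min(ranks) if ranks else float("inf")

def cond_rank(b, a):
    """Conditionalization: kappa(B|A) = kappa(A and B) - kappa(A)."""
    return rank(lambda w: a(w) and b(w)) - rank(a)

def accepts(a, b):
    """'If A then typically B' holds iff kappa(A and B) < kappa(A and not B)."""
    return rank(lambda w: a(w) and b(w)) < rank(lambda w: a(w) and not b(w))

bird = lambda w: w[0]
flies = lambda w: w[1]
not_flies = lambda w: not w[1]

print(cond_rank(not_flies, bird))  # disbelief in "does not fly", given "bird"
print(accepts(bird, flies))        # qualitative inference: birds typically fly
```

The subtraction in `cond_rank` mirrors division of probabilities on a logarithmic scale, which is the sense in which ranks can be read as orders of magnitude of probabilities.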

10:40 – 10:55

Our paper is concerned with methods of aggregating statistical results. The direct motivation is a phenomenon known as “extremizing”: in some cases it seems rational to bring the aggregated opinion beyond all the individual expert opinions, i.e., q* > q_i for all experts i. This phenomenon can be connected to the “risky shift” observed in social psychology, where agents irrationally amplify each other’s opinions. But it also naturally relates to successful forecasting methods, as discussed in Tetlock’s popular science book “Superforecasting”, and to corrections for the biases described in Kahneman’s prospect theory. We present three models deriving from inductive logic in which extremizing can be explained and motivated. They offer insights by which we can connect themes from inductive logic, social learning, and statistical meta-analysis.
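One textbook setting in which extremizing arises can be sketched in a few lines of Python. The assumptions here are ours and are not necessarily among the three models of the paper: all experts share a common prior, and each reports a posterior based on evidence that is conditionally independent of the others' evidence. The function name `pool_independent` and the example opinions are hypothetical.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def sigmoid(x):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))

def pool_independent(opinions, prior=0.5):
    """Pool posteriors q_i under a shared prior and independent evidence.

    Each expert's evidence shifts the log-odds by logit(q_i) - logit(prior);
    with conditionally independent evidence these shifts add up.
    """
    shift = sum(logit(q) - logit(prior) for q in opinions)
    return sigmoid(logit(prior) + shift)

opinions = [0.7, 0.7, 0.6]
q_star = pool_independent(opinions)
print(q_star > max(opinions))  # the pooled opinion exceeds every individual one
```

Under these assumptions the pooled opinion is more extreme than any single q_i whenever all experts lean the same way relative to the prior; a plain average, by contrast, can never leave the range of the individual opinions.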

In this talk we discuss logics for the syntactical treatment of probabilistic relevance relations. Specifically, we define conservative expansions of Classical Logic endowed with a ternary connective ↝(x,y,z), in effect a constrained material implication, whose intuitive reading is “x materially implies y and it is relevant to y under the evidence z”. In turn, this connective ensures the definability of a three-variable formula R(x, z, y), which represents relevance in the object language. We outline the algebraic semantics of such logics, and we analyze some of their properties. Finally, we apply the acquired machinery to investigate some term-defined weakly connexive implications with some intuitive appeal. As a consequence, a further motivation of (weakly) connexive principles in terms of relevance and background assumptions obtains.

12:15 – 13:45
Lunch break

It has proved notoriously difficult to define harm. Indeed, it has been claimed that the notion of harm is a “Frankensteinian jumble” that should be replaced by other well-behaved notions. On the other hand, harm has become increasingly important as concerns about the potential harms that may be caused by AI systems grow. For example, the European Union’s draft AI act mentions “harm” over 25 times and points out that, given its crucial role, it must be defined carefully.

I start by defining a qualitative notion of harm that uses causal models and is based on a well-known definition of actual causality. The key features of the definition are that it is based on contrastive causation and uses a default utility to which the utility of actual outcomes is compared. I show that our definition is able to handle the problematic examples from the literature. I extend the definition to a quantitative notion of harm, first in the case of a single individual, and then for groups of individuals. I show that the “obvious” way of doing this (just taking the expected harm for an individual and then summing the expected harm over all individuals) can lead to counterintuitive or inappropriate answers, and discuss alternatives, drawing on work from the decision-theory literature.

This is joint work with Sander Beckers and Hana Chockler.

14:55 – 15:20

Non-monotonic conditionals are expressions of the form “if phi then typically psi”. Various inductive inference operators for drawing inferences from a set of such conditionals have been developed over the years. Syntax splitting is a property of inductive inference operators that ensures we can restrict our attention to parts of the conditional belief base that share atoms with a given query. Recently, this notion has been generalized to conditional syntax splitting, inspired by the notion of conditional independence as known from probability theory. In this talk we will present the properties of unconditional and conditional syntax splitting, give an overview of which inductive inference operators satisfy them, and point to relations with probability theory.

The problem of tracing the responsibility for unsafe outcomes to decision-making actors in multi-agent systems is urgent. While all existing approaches focus on deterministic outcomes, assuming that (a group of) agents can be held responsible for φ only if φ actually happens and agents could act differently to prevent φ, we find this notion of responsibility insufficient in many scenarios. In this work we combine the coalition ability operator [G] from [9] with a probabilistic operator Lα from [6] that allows us to reason about probabilities and their changes. This approach allows us to claim that a group of agents can be held responsible for the unsafe outcome even if this outcome does not actually happen, but the group has caused its probability to be increased to an (unacceptably) high level. The proposed logic could be useful for analysing and assigning responsibility to groups of agents for their risky and unsafe behaviors. Finally, we establish (weak) completeness and decidability results for the proposed logic.

Conference Dinner

Thursday, August 31

The objective Bayesian approach to inductive logic appeals to the concept of maximal entropy, a version of what Edwin Jaynes called the Maximum Entropy Principle. Jaynes’ exploration of this principle was very fertile and in 1978 he took stock, writing the paper ‘Where do we stand on maximum entropy?’ to present his view of the state of the art. The application of Jaynes’ principle to inductive logic has also been very fertile and it is the task of this talk to take stock. I will give a gentle introduction to objective Bayesian inductive logic, explaining how it overcomes some problems with Carnap’s approach to inductive logic. I will then describe a range of recent results, produced in collaboration with Jürgen Landes and Soroush Rafiee Rad. These results touch on features of the logic, its connection to conditionalisation, and its robustness, decidability and applicability to knowledge representation and reasoning.

10:40 – 10:55

In this paper we introduce a probabilistic extension of Belnap-Dunn (BD) logic, which was developed as a framework for reasoning with incomplete and/or inconsistent information. We introduce a semantics for this non-classical framework and provide an axiomatization of the resulting non-standard probabilities. We show that this axiomatization is sound and complete with respect to the semantics given by a probabilistic extension of the semantics for BD logic, and provide a translation to the four-valued setup in the spirit of Dunn 2019 to compare the expressive power of the two frameworks. We next explore inductive learning in these non-classical setups and discuss several ways of updating non-standard/four-valued probabilities as well as strategies for merging these non-classical probability assignments. Finally, we compare non-standard probabilities with other formalizations for representing incomplete information, in particular Dempster-Shafer belief functions.

Levesque introduced the notion of only knowing to precisely capture an agent’s belief in the form of a knowledge base. Much work has been done on the single-agent scenario, including the representation of belief in terms of probability and reasoning about beliefs after actions with noisy effects. In contrast, the multi-agent counterpart has not been sufficiently explored. One major reason is the lack of a logical account that faithfully reflects Levesque’s intuition about only knowing, lifts it to the multi-agent case, and can deal with multi-agent-related notions such as common knowledge. In this work, we propose a first-order logic with probabilistic belief, only-believing and common belief and study the interaction among these notions. We demonstrate that in our account, an epistemic state represents the only-believing of an agent if and only if it is the maximal state of the same belief. We also study the relationship between what is believed to be common by agents and the actual common knowledge among them.

12:15 – 13:45
Lunch break

A central debate in the philosophy of science concerns the justification of Occam’s razor, the principle that a simplicity preference is conducive to inductive reasoning. In machine learning, there is a parallel and likewise unresolved debate around the question whether statistical learning theory can provide a formal justification for a simplicity preference in machine learning algorithms. In this talk, I will present an epistemological perspective that synthesizes the arguments of the opposing camps in this debate, and yields a qualified means-ends justification of Occam’s razor in statistical learning theory.

This paper develops a trivalent semantics for the truth conditions and the probability of the natural language indicative conditional. This framework yields two logics of conditional reasoning: (i) a logic C of inference from certain premises; and (ii) a logic U of inference from uncertain premises. But whereas C is monotonic for the conditional, U is not, and whereas C obeys Modus Ponens, U does not without restrictions. We show systematic correspondences between trivalent and probabilistic representations of inferences in either framework; in particular, we show how U recovers and generalizes Adams’s logic of p-valid inference. The result is a unified account of the semantics and epistemology of indicative conditionals and of how we should reason with them.

15:05 – 15:30

In inductive logic, a split has emerged between Standard Bayesians and Imprecise Bayesians. The latter argue that using sets of probability functions to represent beliefs is a more powerful formalism for modelling epistemological concepts like justification and ignorance.

We investigate this debate using a novel methodology. We create an agent-based model with players based on Standard Bayesian inductive logic and Imprecise Bayesian inductive logic. Our players include a wide variety of decision rules, including Maximin, Isaac Levi’s E-Admissibility rule, and several versions of Leonid Hurwicz’s criterion. We compare the short-run performances of the players in a classic decision problem, based on making decisions about an exchangeable sequence of binomial trials.

Our results reveal the Ignorance Dilemma: the features of Imprecise Bayesianism which make it such an epistemologically powerful representational framework for modelling states of ignorance also cause the players to underperform in many decision problems. Divergent sets of probability functions can represent ignorance, but they create convergence problems, which are not fully ameliorated by their decision rules. We explain the trade-off and discuss some implications for applying Bayesian reasoning to inductive logic problems in AI and epistemology.
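The decision rules named in this abstract can be sketched for a single trial as follows. This is a minimal Python illustration under our own assumptions, not the agent-based model of the talk: the credal set is a small finite list of candidate probabilities of "heads", the acts and utilities are invented, and `alpha` in `hurwicz` is taken to weight the worst case (conventions differ in the literature).

```python
# Hypothetical credal set: candidate probabilities of "heads" on the next trial.
credal_set = [0.3, 0.4, 0.8]

# Hypothetical acts, each with (utility if heads, utility if tails).
acts = {
    "bet_heads": (1.0, -1.0),
    "bet_tails": (-1.0, 1.0),
    "abstain": (0.0, 0.0),
}

def expected_utilities(act):
    """Expected utility of an act under each probability in the credal set."""
    u_heads, u_tails = acts[act]
    return [p * u_heads + (1 - p) * u_tails for p in credal_set]

def maximin():
    """Choose the act whose worst-case expected utility is highest."""
    return max(acts, key=lambda a: min(expected_utilities(a)))

def hurwicz(alpha):
    """Weigh worst case by alpha and best case by 1 - alpha (alpha = 1 is Maximin)."""
    def score(act):
        eus = expected_utilities(act)
        return alpha * min(eus) + (1 - alpha) * max(eus)
    return max(acts, key=score)

def e_admissible():
    """Acts that maximize expected utility for at least one member of the set."""
    admissible = set()
    for i in range(len(credal_set)):
        best = max(expected_utilities(a)[i] for a in acts)
        admissible |= {a for a in acts if expected_utilities(a)[i] == best}
    return admissible

print(maximin())       # cautious choice under a divergent credal set
print(hurwicz(0.0))    # fully optimistic choice
print(e_admissible())  # acts that some member of the set would recommend
```

With a credal set this divergent, Maximin declines both bets while E-Admissibility permits either of them, which gives a small-scale impression of how the choice of decision rule matters when the set of probability functions represents ignorance.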

This work initiates a study on using probabilistic propositional logic constraints in the structure and parameter learning of probabilistic circuits, a class of tractable probabilistic models allowing for multiple efficient computations. We provide the theoretical grounds for this combination and lay down a first simple attempt to create a model and learning algorithm to combine constraints and data under the proposed framework. Initial experiments on a collection of binary datasets show that this combination of PC and probabilistic propositional logic is achievable (at least under some modeling assumptions), and opens fruitful perspectives for mixing statistical machine learning with logical expert knowledge.

Friday, September 1

One problem to solve in the context of information fusion, decision-making, and other artificial intelligence challenges is to compute justified beliefs based on evidence. In real-life examples, this evidence may be inconsistent, incomplete, or uncertain, making the problem of evidence fusion highly non-trivial. In this talk, I will present a new model for measuring degrees of belief based on possibly inconsistent, incomplete, and uncertain evidence, by combining tools from Dempster-Shafer Theory and Topological Models of Evidence. Our belief model is more general than the aforementioned approaches in two important ways: (1) it can reproduce them when appropriate constraints are imposed, and, more notably, (2) it is flexible enough to compute beliefs according to various standards that represent agents’ evidential demands. The latter novelty allows us to compute an agent’s (possibly) distinct degrees of belief, based on the same evidence, in situations when, e.g., the agent prioritizes avoiding false negatives and when it prioritizes avoiding false positives. Finally, I will discuss further research directions and, time permitting, report on the computational complexity of computing degrees of belief using the proposed belief model. The main part of the talk is based on joint work with Daira Pinto Prieto and Ronald de Haan. The underlying topological formalism for evidence and belief has been developed in collaboration with Alexandru Baltag, Nick Bezhanishvili, and Sonja Smets.

10:40 – 10:55

In the present work, we propose an account of the probability of a counterfactual conditional in terms of belief functions from Dempster-Shafer Theory. We argue that our account, unlike other proposals, is faithful in that it characterizes exactly the probability of the proposition expressed by a counterfactual conditional, i.e. the sum of the weights of the possible worlds that make that counterfactual true. Our proposal is based upon some logico-algebraic results that establish a logical equivalence between Lewis’ counterfactuals and modal conditionals obeying the principles of conditional probability. Hence, as a corollary, we show how the probability of a counterfactual can be interpreted as the probability of a necessitated conditional, which is indeed a belief function. We then explore the properties of the belief functions induced by our proposal, showing that Lewis’ axioms for counterfactuals force peculiar constraints over the belief function associated with a counterfactual. In the end, we argue that these results provide a faithful probabilistic interpretation of Lewis’ logic of counterfactuals, and, in turn, that Lewis’ logic of counterfactuals axiomatically characterizes a certain class of belief functions.

Propositional probability theory, as here conceived, deals with knowledge representation in the form of probabilistic (p-)valuations of the atomic propositions of a propositional language, of probability (p-)distributions over propositional state descriptions, and with their relation. E.g., each distribution entails a unique valuation, but as a rule not vice versa. For both levels it makes sense, in certain contexts, to speak of the factually true one and hence of singular hypotheses about the true one. To define the degree of truthlikeness of such singular hypotheses, we will use simple, and formally similar, normalized distance measures between valuations and between distributions. All degrees (above and below) are illustrated by data about the co-morbidity of psychiatric disorders (Van Loo et al. 2015). The paper deals with eight conceptually meaningful ‘upward’ and ‘downward’ conjectures about the relation between comparative truthlikeness judgements on the two levels. The distance measures also suggest plausible degrees of (internal) dependence and disorder of distributions, as well as degrees of disorder and biasedness of valuations. In all cases, they mirror interesting degrees of truthlikeness. We also indicate how the truthlikeness measures can be generalized from singular to disjunctive hypotheses about the actually true valuation and the actually true distribution. In the final section we list several further research questions, among them how these degrees are related to propositional probability logics. The paper is highly inspired by the rich paper of Gustavo Cevolani and Roberto Festa (2021).

12:15 – 13:45
Lunch break

Probabilistic algorithms are nowadays very common tools and they are efficiently employed for a great variety of different computational tasks. This has radically changed our perspective on several well-established computational notions. Correctness is probably the most fundamental one. If we consider, indeed, that probabilistic programs are nondeterministic, and thus not supposed to always associate the same output to a given input, it is clear that trying to show that a probabilistic program computes the correct result is an idle endeavour. Nevertheless, we often have quite strong expectations about the frequency with which such a program should return certain outputs. Instead of correctness we can then talk about trustworthiness and understand it as the property enjoyed by a program if the frequency of its outputs matches the probability distribution which is supposed to model its expected behaviour. We present a formal computational framework that formalises this idea. In order to do so, we define a computational calculus endowed with a type theory in which we can simulate the behaviour and interactions of probabilistic programs and compute whether or not their behaviour complies with target probability distributions that model the task that they are supposed to accomplish.
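The idea of trustworthiness as a match between output frequencies and a target distribution can be illustrated empirically. The Python sketch below is only a sampling-based stand-in under our own assumptions and does not reproduce the typed calculus of the talk: the distance measure (total variation), the tolerance, the number of runs, and the two example programs are all hypothetical choices.

```python
import random
from collections import Counter

def total_variation(freqs, target):
    """Total variation distance between two distributions given as dicts."""
    keys = set(freqs) | set(target)
    return 0.5 * sum(abs(freqs.get(k, 0.0) - target.get(k, 0.0)) for k in keys)

def empirical_frequencies(program, runs, rng):
    """Run the program repeatedly and return the relative frequency of each output."""
    counts = Counter(program(rng) for _ in range(runs))
    return {output: count / runs for output, count in counts.items()}

def is_trustworthy(program, target, runs=10_000, tolerance=0.05, seed=0):
    """Deem a program trustworthy if its output frequencies stay close to the target."""
    rng = random.Random(seed)  # fixed seed so the check is reproducible
    freqs = empirical_frequencies(program, runs, rng)
    return total_variation(freqs, target) <= tolerance

def fair_die(rng):
    """Intended behaviour: each face with probability 1/6."""
    return rng.randint(1, 6)

def loaded_die(rng):
    """Faulty behaviour: face 6 comes up half of the time."""
    return 6 if rng.random() < 0.5 else rng.randint(1, 5)

uniform = {face: 1 / 6 for face in range(1, 7)}

print(is_trustworthy(fair_die, uniform))
print(is_trustworthy(loaded_die, uniform))
```

Neither program is "correct" in the deterministic sense, since repeated runs give different outputs; what the check distinguishes is whether the observed frequencies are compatible with the distribution that models the intended task.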

The prototype theory of concepts, originating in the 1970s (e.g., Rosch 1978), is an influential account of conceptual representations that focuses on typical properties of categories, as opposed to definitions of category membership. Following Schurz (2012), who proposes to capture the typicality of properties in terms of the conditional probabilities of properties given a category P(P|C), as well as its reverse, P(C|P), which measures diagnosticity, Schuster (2022) developed and empirically validated a probabilistic model of prototypes and of the typicality of category members. Intuitively, typicality is also the foundation of default reasoning: We infer that some bird can presumably fly, because flying is a typical ability of birds. However, Connolly et al. (2007) deny that prototype concepts can be a foundation for default reasoning in humans, because modifiers influence likelihood ratings: Subjects find “Handmade cups are used for drinking” less likely than “Cups are used for drinking”. In response, Strößner (2020) investigated to what extent this modifier effect is probabilistically rational, relating it to different types of monotonicity (cautious, rational). In our talk, we want to bring together the more sophisticated model of probabilistic prototype representation in Schuster (2022) with the considerations on modification and non-monotonic reasoning in Strößner (2020), to give a more complete account of how default reasoning can be based on the information that is stored in prototype concepts.

15:05 – 15:20
Closing words