KG LEGAL \ INFO
BLOG

SOFTWARE AS A SERVICE MODEL – LEGAL ASPECTS AND TAX ISSUES – Delivery, facilitation and electronic interface from the point of view of tax authorities

Publication date: October 23, 2025

Recently, administrative courts have been considering whether VAT obligations can be imposed on individuals and companies that provide services via the SaaS model. This article analyzes the nature of the SaaS model, its advantages and disadvantages, the legal obligations it creates for both users and service providers, and the latest administrative court case law on this model.

LICENSING DATASETS FOR AI TRAINING

Publication date: October 23, 2025

DATASETS AND SYNTHETIC DATA

Creating and developing an AI model requires an enormous amount of data. Once the data is input, the model analyzes it, performs calculations, and draws conclusions that inform its future operations. By analogy with humans, one could say that AI “learns” in this way. AI systems are trained on numerous examples and derive patterns from them, which allows them to predict correct solutions. This process is called “AI training”. Data is currently so expensive and difficult to access that it is estimated that it may be in short supply by 2032.

The answer to these problems is synthetic data. This data is generated by the AI itself, which takes parameters from real-world data and randomly generates further scenarios. These scenarios are designed to faithfully reproduce the properties, complexities, and relationships observed in the original data from which they were generated. There are certain risks involved. The main one is that synthetic data may be built on erroneous assumptions drawn from the real data. When new real data is continuously introduced, earlier errors can be corrected and the AI can “unlearn” them, whereas if the first synthetic data generated is “contaminated”, each subsequent generation will also contain erroneous information. The undoubted advantage of synthetic data is that it can be used to generate scenarios that occur very rarely in practice or not at all. This is used in industries such as automotive (simulating traffic scenarios), finance (detecting fraud), and healthcare (detecting and treating rare conditions).
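The idea of taking parameters from real data and randomly generating further scenarios can be illustrated with a minimal sketch. It assumes, purely for illustration, that the real data is a single numeric column that is roughly normally distributed; the function names and the sample values are hypothetical, and real synthetic data generators are far more elaborate. The second part shows the contamination risk: one erroneous real value shifts everything generated from it.

```python
import random
import statistics


def fit_parameters(real_values):
    """Estimate the mean and standard deviation of the real observations."""
    return statistics.mean(real_values), statistics.stdev(real_values)


def generate_synthetic(real_values, n, seed=0):
    """Draw n synthetic values from a normal distribution fitted to the real data."""
    mu, sigma = fit_parameters(real_values)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [rng.gauss(mu, sigma) for _ in range(n)]


# Hypothetical "real" measurements, e.g. vehicle speeds in km/h.
real_speeds = [48.0, 52.5, 50.1, 47.3, 55.2, 49.8]
synthetic_speeds = generate_synthetic(real_speeds, n=1000)

# Contamination: a single erroneous real value (500.0) distorts the fitted
# parameters, and every synthetic value generated afterwards inherits the error.
contaminated_speeds = real_speeds + [500.0]
synthetic_bad = generate_synthetic(contaminated_speeds, n=1000)

print(round(statistics.mean(synthetic_speeds), 1))
print(round(statistics.mean(synthetic_bad), 1))
```

In this sketch the error is only corrected if the real data itself is corrected or supplemented, which mirrors the point above about continuously introducing new real data.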

Polish CSIRT Cyfra – a new unit to fight cyber threats

Publication date: October 23, 2025

The Polish Ministry of Digital Affairs is establishing a new sectoral cybersecurity incident response team – CSIRT Cyfra. The formal establishment of CSIRT Cyfra is planned for April 2026, with full operational readiness for June 2026. The unit will be responsible for protecting digital infrastructure, and its tasks will include monitoring threats, rapidly responding to attacks, and providing technical support to institutions using digital services.

Adversarial AI attacks and regimes of liability

Publication date: October 23, 2025

Adversarial attacks and why they are possible

The capabilities of AI systems are increasingly impressive, and they rest largely on machine learning. This technique “trains” algorithms on vast amounts of data, which automates their operation and radically increases their “cognitive” capabilities, particularly generalization and drawing conclusions from the data obtained. Predictive language models are one example: they predict, with high probability, the next word that fits a given statement or sentence in its context. To be precise, most AI systems are based on a specific machine learning technique known as an artificial neural network, or deep neural network, a metaphor that likens the logic of AI systems to the functioning of the human brain.
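Next-word prediction can be illustrated with a deliberately simple sketch. It is a count-based bigram model, not a neural network: it only records which word most often follows another in a training text. The corpus and function names are hypothetical and serve only to show the principle of predicting a likely continuation from data.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical training text.
corpus = "the court ruled that the court may review the decision"
model = train_bigrams(corpus)

print(predict_next(model, "the"))      # prints "court"
print(predict_next(model, "unknown"))  # prints "None"
```

Because the model only reproduces patterns found in its training text, input that differs from that text can lead it astray, which gives a rough intuition for why carefully crafted inputs can mislead far more complex systems.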

Unstructured data in the context of the Data Act

Publication date: October 21, 2025

Today, useful data includes not only information organized into rows, columns, or databases, but also data that is not organized in any defined way. Such unstructured data makes up the majority of the data we encounter, including images and text documents such as tweets and blog posts. Thousands of individuals and organizations generate it daily, with little regard for how it can be used. It is precisely unstructured data that makes such rapid AI development possible through machine learning, which involves training algorithms to find patterns and correlations in large data sets.

