ISSN 1479-4411

First published in 2003


Electronic Journal of Knowledge Management

   

Paper 1 - Issue 2
   


Shared Languages and Shared Knowledge
Rafif Al-Sayed and Khurshid Ahmad
University of Surrey, Guildford, UK
r.sayed@eim.surrey.ac.uk ; k.ahmad@eim.surrey.ac.uk  

   
1.         Introduction

The transfer of knowledge within an organisation, across organisations, between an individual and an organisation, and between individuals is facilitated through a number of sign systems. Such systems include natural languages, mathematical equations, subject-specific notations, and other conventions including graphical conventions. Facilitation is a broad term; the key to facilitation, however, is a common consensus on the meanings of words of natural language, kinds of mathematical equations, and agreement on notations and conventions. So, in some respects, the transfer of knowledge requires a consensus amongst organisations and individuals.

Much knowledge management literature has focused on the “sharing” of know-how and expertise through protocols devised by managers (Nonaka and Takeuchi 1995, Davenport and Probst 2002) or the focussed discussion of problems related to the sociology of organisations (Scarbrough 1996). Some have even looked at this problem from a cybernetic point of view in terms of feedback and control systems (Morgan 1996). Management Studies, sociology, and cybernetic models address fairly high-level conceptual issues. However, the surface form of knowledge, the trace of knowledge left behind on a document, whether paper or electronic, is amongst the few discernible forms of knowledge. We will focus on how this trace is transferred.

The long-standing controversy about the relationship between knowledge and language (see Baker and Hacker on Wittgenstein 1988) notwithstanding, it is almost universally true that the development of a subject or the development of a subdomain within a subject discipline invariably leads to the appropriation of certain words from the everyday natural languages of the emergent subject or subdomain workers. Words are given specialist interpretation; words like energy, mass and force existed in the English language prior to Isaac Newton. However, after Newton propounded his theory relating to the material nature of being, these three words assumed a more specialist meaning and spawned a whole new discipline, i.e. physics. Physicists, initially called natural philosophers, started discussing different kinds of forces, different sources of energy and problems relating to the metrication and instrumentation of quantities related to energy, mass and force. No journal of physics, standard textbook or encyclopaedia of physics will accept an alternative term for these concepts. There is no obvious coercion but there is a consensus. The consensus is brought about partly through patronage: for instance, having a degree in physics will allow one to write a doctoral dissertation or indeed obtain a job in various physics establishments, but one has to speak and write in the specialist language of physics. Much the same is true of other disciplines.

We mentioned the development of subdomains within a specialism. Sometimes the subdomain relates specifically to the application of principles and empirical results related to the parent domain. In our times, gene therapy is a good example of such a transfer. Starting from the rather abstract concept of the molecular basis of animal or plant life, originally a theoretical and experimental enterprise variously called biochemistry and molecular biology, one sees the development of industrial methods and instrumentation for extracting and harvesting so-called genetic material – an enterprise now called genetic engineering. From genetic engineering the notion developed that some genetic material can malfunction, giving rise to sickness of various organs within an organism; by replacing the defective genetic material, the organ will recover - hence gene therapy. Each of these different subjects, for example molecular biology and gene therapy, has its own vocabulary and, indeed, writing styles for the discussion of theories and the reportage of experimental results.

Consensus relating to terminology, and elements of other sign systems, is used to show a commitment to certain concepts within a particular domain. This commitment is, in one sense, philosophical, for example Newton’s notion of the material being of nature is a philosophical commitment to materialism articulated through words of the English language which were given specialist meaning. The commitment also relates to the basis of methods and techniques of the new science of the material being – physics – in that Newton chose differential calculus over algebra or geometry to describe the movement of material beings. A series of graphical conventions were adopted for displaying the results of experimental observations and tabulation protocols were set up to show the relationship between two or more variables. There is a third sense of this commitment which relates to the structure of knowledge – also referred to as epistemological commitment – in that Newton argued about the primacy of the three concepts, mass, force and energy, and emphasised that the other physical concepts could be derived from these three. The umbrella term for different kinds of commitment adopted by a domain community at a given time in their genesis relates to the existence of that community and of the ideas propounded by the community. This umbrella term is ontology – the study of the existence of being: the commitments could be called different kinds of ontological commitments.

 

In this paper, we discuss some of the challenges and opportunities related to sharing knowledge between experts and practitioners within a specialist domain and the sharing between the two groups and the potential end-users of the knowledge of the domain or those upon whom the knowledge will have an impact. The case in point here is that of breast cancer therapy. This is an extensively researched topic involving major laboratories and academic departments working on cancer treatment. The results of their deliberations are published in learned journals, written in a formal style for peer-to-peer communication – if you are not an expert or aspiring to be one in oncology or radiation therapy, for example, learned papers in these disciplines will mean very little to you. The knowledge of the experts is refined, related to the knowledge of other experts, and then passed on to the practitioners including cancer therapists working in hospitals, some having close links with the laboratories/departments, and nurses specialising in cancer therapy together with technicians involved in the operation of complex radiotherapy machines, various imaging devices, and/or highly toxic drug treatments. This refined and correlated knowledge is documented in a peer-to-operative language and practitioners themselves write some of the documents. Another important development in recent times has been that of digital libraries and documentation archives that can be accessed through the Internet. Nowadays, the Internet is the first place people go to seek clarification and knowledge related to complex topics; sometimes cancer patients, especially those who have just been diagnosed or are about to receive (novel) therapy, tend to consult the Internet. Major cancer charity organisations have devised documents in a language which is more accessible to this new audience. These documents are written in an operative/expert-to-lay person language.

We report on the development of an information spider: a computer program that can allow access to a range of documents, for example learned papers, practice manuals, and fact sheets. The spider not only allows access but helps in creating a text archive and in extracting terms from documents for indexing purposes as well.

2.         Shared concepts, terminology and knowledge spirals

Early literature on knowledge management focused on sharing knowledge related to industrial innovation: there are two well-cited examples of this genre of sharing. The first relates to the development of new product lines by persuading researchers, product designers, manufacturing and sales personnel to work together across departmental and status boundaries (Nonaka and Takeuchi 1995:95-123). The second example relates to the sharing of ‘local innovation’ in the design of usable technology by sharing the knowledge of the end-users of the products (Seely-Brown 1998). Both of these classic examples describe how large organisations used brainstorming methods, and software systems for co-designing and for cross levelling the knowledge within the organisations.

Knowledge sharing in more recent literature stresses more indirect interaction between the constituent members of a (geographically distributed) organisation. For instance, organisations keen on their staff sharing ‘best practices’ typically use a document repository – for example reports of past successful/failed projects, employee, product, and service profiles (e.g. the so-called Yellow Pages) – and tools for inputting and extracting knowledge from such repositories (Davenport and Probst 2002). The range of knowledge sharing systems extends from document management systems, through systems that manage documents which have been selected and annotated by experts for the use of others (Gibbert, Jonczyk and Völpel 2000), to the ambitiously-titled intelligent systems (Fisher and Ostwald 2001).

Knowledge sharing within a community is a more recent phenomenon and appears to be supported by public-sector organisations. For example, the US National Cancer Institute, a US government agency, is ‘cross levelling’ knowledge across the sub-communities of cancer researchers, cancer-care professionals, and the public at large (Cancer 2003). Again, a document repository is at the heart of the National Cancer Institute’s system. The repository comprises newsletters, fact-files, journal papers, application notes for care workers, information specific to cancer for the public at large, and a glossary of terms.

 

2.1        Intra-organisational knowledge sharing and exchange

Classical knowledge sharing models suggest that the knowledge transfer/sharing process involves the conversion of tacit knowledge into explicit knowledge and vice versa. En route there are processes that help share explicit and implicit knowledge without conversion. These models focus largely on how knowledge is shared within an organisation or intraorganisationally. The sharing of knowledge within an organisation at one level should be part of the natural functioning of the organisation. At another level there are a number of bottlenecks inhibiting this transfer, including physical problems of disseminating information, social problems related to prestige and power, and linguistic problems of sharing knowledge across different levels and kinds of expertise. As we show later, interorganisational transfer of knowledge can pose equally severe challenges.

The terms implicit and explicit knowledge are ambiguous and subject to much philosophical debate. For Nonaka and Takeuchi (1995) the conversion of knowledge from implicit to explicit and finally to implicit is the basis of knowledge creation. Choi and Lee (2002) have observed a close relationship between the management strategies of Korean enterprises and the knowledge conversion modes suggested in Nonaka and Takeuchi.

Generally, explicit knowledge is formalised consensually, and is articulated in the language of a specialist domain through texts. These texts are either informative (learned texts) or instructive (instruction manuals). Implicit knowledge is articulated mainly through the spoken word and is suffused with metaphors, similes, and analogies. Implicit knowledge is largely informal and idiosyncratic of individuals. Documents like inter-office memos, product catalogues, advertisements for goods and services, comprise both implicit and explicit knowledge.

The knowledge conversion process involves a close interaction between, and understanding amongst, the key players - the knowledge crew of an organisation: these include the experts, professional workers, including production/marketing/sales staff, researchers and design engineers, the end-users of the artefacts created by the experts and professional workers. The artefacts may include goods and services.

There are four modes of knowledge conversion, according to Nonaka and Takeuchi (1995:71-73), and we discuss these modes with reference to the exchange of terminology and concepts amongst the crew during each of the modes:

(i)         In the SOCIALISATION mode the crew works on an informal basis: verbal exchanges enable the crew to understand each other’s vocabulary.

(ii)         SOCIALISATION is followed by EXTERNALISATION. Here, an inventory of novel, revised, and abolished concepts is produced in a written document.

(iii)        SOCIALISATION and EXTERNALISATION produce fragmented knowledge. The knowledge crew then tends to fuse concepts and terminology in the so-called COMBINATION mode. The fusion is implicit in the development of new methods of working or new products.

(iv)        Once the method and products are established, the crew internalises the operational details, sometimes improving on them and at other times jettisoning some of the new knowledge. This is the INTERNALISATION mode of knowledge transfer. This ultimately leads back to SOCIALISATION, EXTERNALISATION and COMBINATION.

The articulated, public and consensual development of a shared conceptual system and its vocabulary is more vivid in a loosely-organised setting, e.g. systems for sharing best practice, than in the high-pressure settings encountered in the creation of a new type of automobile, home bakery (Nonaka and Takeuchi 1995), or smarter and non-intrusive photocopiers (Seely-Brown 1998), where an organisation explicitly plans for a targeted change.

Best practice is shared across an organisation and the recipients of collated/created knowledge are not as well defined as may be the case for design and production engineers sharing the ideas of an architect (product/services) and a marketing expert. Recent developments in knowledge creation are broad-spectrum. This we discuss next.

2.2        Inter-organisational knowledge sharing and exchange

Mergers and acquisitions (M&A) between organisations present a major challenge to knowledge management in that M&A precipitate lasting changes in the participating organisations, and the acquiring organisation undergoes changes when it takes over the other organisation. The example of Siemens’ Information and Communication Mobile (ICM) segment is quite apt here (Kalpers et al 2002).

There are a number of tasks that involve the workers in the two (or more) organisations during a merger and acquisition: Kalpers et al describe the workers as a Business Community: ‘a [geographically and organizationally distributed] group of people who share existing knowledge, create new knowledge, and help one another on the basis of a common interest in a business-related topic’ (2002:197). The Business Community ‘was designed as socio-technical system’ for facilitating the ‘combination of knowledge and the creation of new knowledge’ (ibid:198). The five main activities of the Business Community suggest that the exchange of knowledge is primarily through social interaction and quadri-modal as per Nonaka and Takeuchi (Table 1).

Table 1: Activities of the Business Community and knowledge conversion modes.

Key Activities of the Business Community                                                        Soc   Ext   Comb  Int
Sharing regular events: face-to-face and phone conference                                       ✓     -     -     -
Urgent request forum: Discussion forum with email and Net-meeting sessions                      ✓     ✓     -     -
Information-platform process for knowledge packages and project information                     -     ✓     ✓     -
Merger and Acquisition (M&A) process improvement work-shops                                     -     -     ✓     ✓
Disseminating information related to M&A projects through information brokering and debriefing  ✓     -     -     ✓

The technical component of the Business Community is an information system that helps in the storage, annotation and retrieval of documents. Kalpers and colleagues talk about K(knowledge) Packs: clearly formatted structures for encapsulating meta-level and summarised contents of documents. The documents can be classified in different facets: (i) according to the type of change – merger, acquisition, divestment; (ii) according to the relevant business process – human resources, logistics, product design; (iii) according to M&A processes and phases - monitoring, evaluation, integration/post closing; (iv) according to IT topics - data, applications, infrastructure, security; and (v) according to the organisational structure of Siemens – group-wide, business-unit wide, region-wide. K-Packs range from informative (contacts, project documentation, laws, contracts) to instructive documents (checklists, documents templates, lessons learnt/annotated histories).
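The faceted classification of K-Packs described above can be pictured as a small record type. The following is only a minimal sketch under stated assumptions: the class name `KPack`, its field names and the `select` helper are hypothetical and do not come from Kalpers et al; only the facet values are taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KPack:
    """Hypothetical record for one K-Pack; one field per facet listed in the text."""
    title: str
    change_type: str        # merger, acquisition, divestment
    business_process: str   # human resources, logistics, product design
    ma_phase: str           # monitoring, evaluation, integration/post closing
    it_topic: str           # data, applications, infrastructure, security
    org_scope: str          # group-wide, business-unit wide, region-wide
    kind: str               # informative or instructive
    keywords: tuple = ()    # keywords appended by human editors


def select(packs, **facets):
    """Return the K-Packs whose facet values match every requested facet."""
    return [
        pack for pack in packs
        if all(getattr(pack, name) == value for name, value in facets.items())
    ]


# Invented example records, for illustration only.
packs = [
    KPack("Integration checklist", "acquisition", "human resources",
          "integration/post closing", "applications", "group-wide",
          "instructive", ("checklist", "integration")),
    KPack("Project contacts", "merger", "logistics",
          "monitoring", "data", "region-wide",
          "informative", ("contacts",)),
]

instructive_acquisitions = select(packs, change_type="acquisition", kind="instructive")
print([pack.title for pack in instructive_acquisitions])
```

Browsing by facet then amounts to calling `select` with one or more facet values, while keyword search would operate on the `keywords` field.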

This multi-faceted information platform is called an information spider or an infospider. There is a team of authors and editors involved in providing potentially ‘reusable knowledge’ to this document repository. According to Kalpers et al ‘a sophisticated search engine allows the user to keyword-search (sic) the K-Packs …[and there are facilities] to browse the most popular and often used K-Packs’ (2002:201). The initial evaluation of Siemens’ M&A Knowledge Exchange (MAKE) appears to be encouraging. What interests us is how the M&A experts built up the knowledge of the mergers and acquisitions business.

3.         Special language and knowledge sharing

The different modes of knowledge conversion help in the articulation, explanation, revision, and acceptance/rejection of key concepts within a group with diverse interests: the players in the group ensure that the terminology they use in articulation and explanation of concepts is clearly understood by others. The group interaction helps the group in achieving a shared understanding of concepts by sharing the terminology of each other. There is anecdotal/case study evidence in Nonaka and Takeuchi suggesting that ‘speaking a common language and having discussions can assemble the power of the group. This is a vital point, even though it takes time to develop a common language’ (1995:99). The development of the understanding of the vocabulary of a specialism is discussed under the rubric of languages for special purpose (LSP) (Sager, Dungworth and MacDonald 1980; Schröder 1991): this subject has an active constituency in Northern Europe and North America as evidenced by academic journals (e.g. Fachsprache). The use of LSP in shaping specialist written knowledge is a subject of debate in pure and applied linguistics (Halliday and Martin 1993; Bazerman 1988). One major area of research in LSP is the growing gulf between language used by experts and by the layperson.

3.1        Knowledge exchange and LSP terminology

Any specialist language is a part of the natural language of the authors of specialist texts: ‘Scientific English may be distinctive, but it is still a kind of English, likewise scientific Chinese is a kind of Chinese’ (Halliday and Martin 1993:4). Pejorative remarks that equate specialist talk with obfuscating jargon notwithstanding, specialist languages are an excellent example of parsimony that hallmarks human cognition: a small set of keywords is used to represent a large body of knowledge, or, more specifically, these keywords usually comprise a significant proportion of specialist texts. This parsimony is essential for reducing ambiguity and increasing precision. An even smaller set of single words is used by the community as their (specialist) signature: physicists will write around and about mass, energy, force, time and space, biologists around and about life forms, evolution, heredity, and environment for instance.

The role of shared terminology in knowledge creation is perceptible in the MAKE system. Each K-Pack has associated keywords and MAKE has access to a search engine that presumably makes use of the keywords. Human editors append the keywords to the documents. The editors make a judgement about the suitability of the keywords for a given document and assume that a potential user will be familiar with the keywords. This is a time-consuming and expensive process.

In the following, we outline a method for automatically extracting candidate single-word terms and compound terms, and for automatically identifying relationships between terms based solely on the behaviour of the candidates in relation to other terms and words used in everyday discourse, the so-called general language discourse. Our method is domain-independent and relies only on a representative but random sample of texts used in a given specialism – cancer care for example – together with a sample of texts used in general language.

3.2        A text-based method for identifying shared knowledge

The introduction, usage, and obsolescence of words in a language are complex and creative. Language experts, particularly lexicographers, have advanced a plausible explanation in relation to the birth, currency, and death of words: they argue that the frequency of a word generally correlates with its acceptability by the language community (Quirk et al 1985). The frequency is computed by examining a collection of written texts (or speech fragments) randomly sampled from a universe of texts. Such sampling is essential especially since the language system is open-ended.
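The frequency computation mentioned above can be sketched in a few lines. This is a minimal illustration, not the procedure used by lexicographers or by the authors: the tokenisation rule (lowercased alphabetic strings, with internal hyphens allowed) and the function names are assumptions, and the two sample sentences are invented.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens (assumed rule)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def relative_frequencies(texts):
    """Return (counts, relative frequencies, corpus size N) for a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    total = sum(counts.values())
    rel = {word: n / total for word, n in counts.items()} if total else {}
    return counts, rel, total


# Two invented sentences standing in for a sampled text collection.
sample = [
    "The nucleus of the atom contains protons and neutrons.",
    "The energy of the nucleus depends on the nuclear force.",
]
counts, rel, total = relative_frequencies(sample)
print(total, counts["the"], rel["nucleus"])
```

The relative frequency f/N, rather than the raw count f, is what allows corpora of different sizes to be compared.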

Corpus linguistics is a branch of linguistics where the emphasis is on the use of systematically organised text collections – text corpora or text corpus (singular) – as a starting point of linguistic description or as a means of verifying hypotheses about a language. Machine-readable versions of such collections have been developed for major languages of the world. One major beneficiary of corpus linguistics is lexicography – and many individual dictionary publishers have their own in-house corpora.

The British National Corpus (BNC) of 20th century English language comprises over 100 million words including written text (c. 90%) and speech fragments (10%) (Aston and Burnard 1998). The written component comprises 3,209 texts published mainly between 1975 and 1993: two-thirds of the texts belong to imaginative genres (novels, literary magazines), the arts, world affairs and leisure, and the other third to natural, pure, applied and social sciences. There are approximately 250,000 unique words including plurals of nouns and verbs in different tenses. Some of the words are used in most texts and most frequently - 6% of the BNC is the word the (6 million instances) - and yet others are used rarely; the word cancer is used 949 times in the BNC, neutron appears 247 times and radionuclide 40 times. Words like ‘the’ and other determiners (a, an), conjunctions (and, but), and prepositions (in, on) are the most frequent and comprise a quarter of the BNC. These are called closed-class words as English-language users seldom invent new determiners or prepositions.

Words belonging to the open-class category (nouns, adjectives, adverbs) are not as frequent. Indeed, amongst the 100 most frequent words in the BNC, which together comprise about half the words in the corpus, there are only two nouns: time and people.

3.2.1     Language-related and subject-related signatures

Recall that a specialist writing about his or her domain of specialist knowledge writes in a form of natural language. A specialist document typically has two signatures. The first signature signifies the natural language of the document and the second signifies the special domain.

A corpus-based analysis of a number of individual subject domains, ranging from subjects as diverse as nuclear physics to dance studies, philosophy of science to sewer engineering, theoretical linguistics to cancer research, suggests the existence of the two signatures (Ahmad 2001 and references therein). A corpus was created for each domain usually by keying in a subject name on a search engine and selecting texts of different genres: journal papers, text books, advertisements for goods and services, conference announcements specifically dealing with topics in the domain. The corpora varied from 150,000 words to 750,000 words.

The language-related signature of an English LSP shows itself in the distribution of closed-class words. This distribution is the same as that of the British National Corpus: the first 10 most frequent words in almost each of the domains included determiners, prepositions, and conjunctions. The subject-related signature of an LSP is reflected in the profusion of open-class words, mainly nouns, in the 100 most frequent words: in some disciplines as many as 30 nouns comprise the 100 most frequent words and in others about 10 or so.
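The two signatures described above can be approximated by splitting the most frequent words of a corpus into closed-class words and the rest. The sketch below is only illustrative: the `CLOSED_CLASS` list is a short, deliberately incomplete assumption, the function name is hypothetical, and no part-of-speech tagging is done, so the second list contains open-class candidates rather than confirmed nouns.

```python
from collections import Counter

# Illustrative and incomplete list of English closed-class words (an assumption).
CLOSED_CLASS = frozenset({
    "the", "a", "an", "of", "in", "on", "to", "for", "by", "with", "at", "from",
    "and", "but", "or", "that", "this", "is", "are", "was", "be", "it", "as",
})


def split_signature(counts, top_n=100, closed_class=CLOSED_CLASS):
    """Split the top_n most frequent words into closed-class words and others."""
    top = [word for word, _ in counts.most_common(top_n)]
    closed = [word for word in top if word in closed_class]
    others = [word for word in top if word not in closed_class]
    return closed, others


# Invented word counts for a toy specialist corpus.
toy_counts = Counter({
    "the": 60, "of": 35, "and": 30, "in": 20,
    "nucleus": 12, "energy": 9, "rarely": 1,
})
language_signature, subject_signature = split_signature(toy_counts, top_n=6)
print(language_signature)
print(subject_signature)
```

Counting how many entries fall in the second list for the top 100 words gives a rough analogue of the "30 nouns" versus "10 or so" comparison between disciplines.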

The most frequent nouns refer to a small group of concepts in the domain: in nuclear physics the 100 most frequent words include the names of key objects of study in nuclear physics - the atomic nucleus, constituent particles of the nucleus, protons and neutrons - and key concepts in physics - energy, force and mass. In linguistics, the 100 most frequent words include the names of the grammatical categories of words (noun, verb, adjective) together with important theoretical notions of transformation, structure and grammar.

The subject-related signature discussed above refers to single words. Specialist language differs more sharply from general language in the usage of compound words, containing as many as six single words. It turns out that the most frequent single words, nucleus and nuclear, are the key ingredients of many of the most frequent compound terms in nuclear physics, i.e., nuclear structure and nuclear reaction, target nucleus, stable/unstable nucleus. 

3.2.2          Automatic identification of terms

It is the profusion of subject-related nouns that distinguishes a special language text from a text written in general language. For example, for one instance of the term nucleus in the BNC there may be as many as 300 instances in a typical nuclear physics corpus – the ratio rising to over 5000 for the plural nuclei.

The ratio of the relative frequency of a word in a specialist corpus to its relative frequency in a general language corpus may suggest whether or not the word is a term. As closed-class words have a similar distribution in the two corpora, the ratio of relative frequencies of these words in the two corpora, one specialist and the other general language, is generally around unity. But the ratio of the relative frequency of subject-related nouns within a specialist text (corpus) to that in the BNC is generally greater than 1 and indicates a candidate term. This ratio is sometimes called the weirdness ratio. The computation of weirdness is the first step in automatic extraction.
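The weirdness ratio defined above can be written as (f_special / N_special) / (f_general / N_general). The following is a minimal sketch of that computation and of the candidate-term filter it supports. The function names, the threshold parameter and the treatment of words missing from the general corpus (an infinite ratio) are assumptions for illustration. In the usage example, the BNC count for cancer (949) and the radiation safety corpus size (127,704) come from this paper; the BNC size is rounded to 100 million and the count of 396 is an approximation derived from the 0.31% relative frequency reported for cancer in Table 2.

```python
def weirdness(f_special, n_special, f_general, n_general):
    """Relative frequency in the specialist corpus divided by that in the general corpus."""
    if f_general == 0:
        # Assumed convention: a word unseen in the general corpus is maximally weird.
        return float("inf")
    return (f_special / n_special) / (f_general / n_general)


def candidate_terms(special_counts, n_special, general_counts, n_general, threshold=1.0):
    """Return (word, weirdness) pairs above the threshold, weirdest first."""
    scored = [
        (word, weirdness(f, n_special, general_counts.get(word, 0), n_general))
        for word, f in special_counts.items()
    ]
    kept = [(word, ratio) for word, ratio in scored if ratio > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Approximate worked example for 'cancer' in the radiation safety corpus.
cancer_weirdness = weirdness(396, 127_704, 949, 100_000_000)
print(round(cancer_weirdness))

# Toy counts: a closed-class word scores about 1, a subject noun far above 1.
toy_special = {"the": 6_000, "nucleus": 520}
toy_general = {"the": 6_000_000, "nucleus": 1_000}
toy_candidates = candidate_terms(toy_special, 100_000, toy_general, 100_000_000)
print(toy_candidates)
```

A threshold of 1 follows the text; in practice a higher cut-off might be chosen to suppress words that are only marginally more frequent in the specialist corpus.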

3.2.3     Subject-related signatures and knowledge sharing

One example of knowledge sharing is the emergence of an applied science or engineering science around a theoretical subject. The example of nuclear physics (NP) will illustrate this point. The systematic use of nuclear radiation in medicine and agriculture is discussed in the radiation physics (RP) literature. RP is based on key concepts in nuclear physics: concepts that help explain naturally radioactive elements, or unstable elements that emit nuclear radiation, or concepts that describe how stable elements can be made unstable, or radioactive, by bombarding or irradiating these elements with other radiation. The emitted radiation is used, in a controlled manner, in radiation therapy or diagnosis. Nuclear (reactor) engineering is a branch of engineering based on the theoretical concepts of nuclear fission in nuclear physics.

The applied sciences and engineering are regulated by law to ensure the safety and well being of humans whilst promoting the use of potentially lethal artefacts like nuclear radiation. Radiation protection/safety has emerged as a discipline following the extensive use of radiation physics.

In order to be autonomous disciplines, both radiation physics and radiation protection have to have their own concepts and associated terminology, a terminology that manifests itself as subject-related signatures. A three-way comparison between the three subjects will show the influences of the parent and the progeny’s own identity. We have created three corpora to study these influences and identity: theoretical nuclear physics (151 texts comprising 444,540 words, published between 1970 and 1999), radiation physics (91 texts, comprising 286,676 words, published between 2001 and 2003), and radiation safety (16 texts, comprising 127,704 words, published in 2003). The texts are written in American and British English and are drawn from journals, textbooks, public announcements and advertisements.

Table 2 shows the ten most frequent single words in each of the corpora: nuclear physics and radiation physics ‘share’ two key terms: energy and neutron; radiation physics and radiation safety ‘share’ the terms dose and radiation. The other eight terms show the autonomy of the disciplines.

Table 2: Subject-related signatures in three disciplines in physics

Nuclear Physics         Radiation Physics       Radiation Safety
N = 444,540             N = 286,676             N = 127,704
Term          f/N       Term          f/N       Term          f/N
energy        0.57%     dose          0.79%     mutation      0.91%
nucleus       0.52%     neutron       0.41%     dose          0.75%
neutron       0.41%     beam          0.40%     disease       0.60%
nucleon       0.35%     radiation     0.33%     gene          0.59%
nuclear       0.32%     energy        0.30%     radiation     0.57%
potential     0.32%     system        0.27%     risk          0.47%
target        0.25%     treatment     0.24%     rate          0.45%
scattering    0.24%     image         0.22%     exposure      0.32%
interaction   0.21%     rays          0.22%     cancer        0.31%
mass          0.20%     detector      0.19%     radionuclide  0.30%
Total         3.390%                  3.356%                  5.254%
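The ‘sharing’ of signature terms between the three corpora can be checked mechanically by intersecting the three term lists of Table 2. The sketch below is illustrative only; the variable and function names are assumptions, while the term lists are copied from the table.

```python
from itertools import combinations

# Ten most frequent single-word terms per corpus, as listed in Table 2.
SIGNATURES = {
    "nuclear physics": {
        "energy", "nucleus", "neutron", "nucleon", "nuclear",
        "potential", "target", "scattering", "interaction", "mass",
    },
    "radiation physics": {
        "dose", "neutron", "beam", "radiation", "energy",
        "system", "treatment", "image", "rays", "detector",
    },
    "radiation safety": {
        "mutation", "dose", "disease", "gene", "radiation",
        "risk", "rate", "exposure", "cancer", "radionuclide",
    },
}


def shared_terms(signatures):
    """Return the sorted shared terms for every pair of subject signatures."""
    return {
        (a, b): sorted(signatures[a] & signatures[b])
        for a, b in combinations(signatures, 2)
    }


for pair, terms in shared_terms(SIGNATURES).items():
    print(pair, terms)
```

With these lists the parent and its direct progeny share two terms in each case, whereas nuclear physics and radiation safety share none of their ten most frequent terms.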

Let us now compare the distribution of five of the most frequent terms in each of our corpora and in the BNC (see Table 3). What one sees in the distributions is that the term energy is used 43 and 23 times more frequently in the NP and RP corpora respectively than in the BNC; more demonstrably, the term dose is used 337 and 291 times more in the RP and RS corpora respectively than in the BNC, and the term neutron is used 790, 1379 and 54 times more in NP, RP and RS corpora respectively than in the BNC. The term nucleon, the weirdest in the three corpora, is used only in our nuclear physics corpus.

Table 3: Weirdness ratio for the most frequent open-class words in the three corpora

Nuclear Physics                 Radiation Physics               Radiation Safety
N = 444,540                     N = 286,676                     N = 127,704
Term        f_NucPhys/f_BNC     Term        f_RadPhys/f_BNC     Term        f_RadSafety/f_BNC
energy      43                  dose        337                 mutation    629
nucleus     535                 neutron     790                 dose        291
neutron     790                 beam        218                 disease     50
nucleon     6402                radiation   125                 gene        309
nuclear     39                  energy      23                  radiation   409

The 10 subject-related signature terms (in Table 2) help in the formation of compound terms and illustrate the linguistic parsimony and linguistic productivity of specialist writers. The term nucleus is used as a head word for two frequent compound terms, target nucleus and halo nucleus, and the neologism nucleon acts as a modifier for the most frequent compound in our nuclear physics corpus, nucleon-nucleon amplitude. In radiation physics neutron is used as a head word for the frequently occurring thermal neutron, as a modifier in neutron capture therapy, and as the other noun in the noun-noun compound neutron fluence. Radiation acts as a dominant constituent in the radiation safety corpus, as a modifier in radiation exposure and radiation dose, in its derivative form radiological protection, and as a head word in ionising radiation.

Table 4: Most frequent compound terms in the three corpora. Terms in italics are neologisms

| Nuclear Physics           | Radiation Physics         | Radiation Safety         |
|---------------------------|---------------------------|--------------------------|
| nucleon-nucleon amplitude | dose distribution         | radiation exposure       |
| neutron star              | thermal neutron           | congenital abnormalities |
| nuclear physics           | neutron capture therapy   | multi-factorial disease  |
| angular distribution      | radiation therapy         | ionising radiation       |
| target nucleus            | neutron fluence           | air concentration        |
| halo nucleus              | spatial resolution        | genetic disease          |
| nuclear reaction          | fluorescence reabsorption | transfer coefficient     |
| nuclear structure         | maximum dose              | radiological protection  |
| angular momentum          | intensity matrix          | breast cancer            |
| radioactive beam          | radiation physics         | radiation dose           |
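
The head-word and modifier roles discussed above can be approximated with a naive two-word window, sketched below. This is an illustration under stated assumptions, not the extraction method used in our system: the function name, the stop-word list and the sample sentence are hypothetical, and without part-of-speech filtering the sketch will also count non-terms such as a noun followed by a verb.

```python
from collections import Counter


def compound_candidates(tokens, signature_terms, stopwords):
    """Count adjacent word pairs containing a signature term.

    The role records whether the signature term is the head (last word)
    or the modifier (first word) of the candidate compound.
    """
    counts = Counter()
    for first, second in zip(tokens, tokens[1:]):
        if first in stopwords or second in stopwords:
            continue  # skip pairs such as "the neutron"
        if second in signature_terms:
            counts[(first, second, "head")] += 1
        elif first in signature_terms:
            counts[(first, second, "modifier")] += 1
    return counts


# Hypothetical sample sentence and stop-word list, for illustration only.
sample = "a thermal neutron hits the target nucleus and the neutron fluence rises".split()
signature = {"neutron", "nucleus"}
stop = {"a", "the", "and"}

candidates = compound_candidates(sample, signature, stop)
```

Here thermal neutron and target nucleus are recorded with the signature term as head, and neutron fluence with it as modifier; the pair neutron hits is also counted, which shows why a real system would need grammatical filtering and frequency thresholds.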

The theoretical notion of a structured and composite nucleus, and of interaction between the constituents of two nucleons (as in n-n amplitude), shows the physico-philosophical bias of the subject and that of the terms. In radiation physics, the term dose (or the energy of the radiation), and its control, dominate the discussion and show the applied physics/engineering bias of the subject. Radiation safety deals with exposure to the risk of nuclear radiation; hence the most frequent terms, radiation exposure and radiation dose, together with the current interest in breast cancer, dominate the discussion in the RS corpus, demonstrating the ethico-legal aspects of the subject.

We have attempted to describe how knowledge sharing can be monitored using a text and terminology management system by identifying the subject-related signature of specialist subjects, and particularly how the sharing of terminology across disciplines indicates the sharing of concepts. The explication of knowledge in nuclear physics resulted in the development of radiation physics, and the explication of radiation physics knowledge led to the domain of radiation safety. Each of the two explications has led to the internalisation of knowledge which, when explicated, has its own terminology.
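
One simple way to put a number on shared terminology is the overlap between the term sets of two corpora. The sketch below uses the Jaccard coefficient (size of the intersection divided by size of the union); this measure is our choice for illustration only and is not claimed to be the measure used in the study. The term sets are the five most frequent terms per corpus listed in Table 3, and the function name is hypothetical.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union (0.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# The five most frequent terms per corpus, as listed in Table 3.
nuclear_physics = {"energy", "nucleus", "neutron", "nucleon", "nuclear"}
radiation_physics = {"dose", "neutron", "beam", "radiation", "energy"}
radiation_safety = {"mutation", "dose", "disease", "gene", "radiation"}

np_rp = jaccard(nuclear_physics, radiation_physics)   # shares energy, neutron
rp_rs = jaccard(radiation_physics, radiation_safety)  # shares dose, radiation
np_rs = jaccard(nuclear_physics, radiation_safety)    # no shared terms
```

On these small sets each neighbouring pair of disciplines shares two of eight distinct terms, while nuclear physics and radiation safety share none, which is consistent with the parent-to-progeny chain described above. Larger term lists would be needed before drawing any firm conclusion.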

The results in nuclear physics and related disciplines have been replicated in the transfer of knowledge from theoretical solid state physics to electron device engineering (Al-Thubaity and Ahmad 2003); in knowledge transfer from civil engineering to environmental planning systems (Ahmad and Miles 2001); and in a study of how concepts in cognitive psychology and structuralism found their way into theoretical linguistics (Ahmad 2002).

In the next section we discuss how the automatic extraction of terminology for identifying the subject-related signature of a domain, and for identifying its impact on its application/applied domain, can be used to build an information spider semi-automatically. Such a method will facilitate the automatic annotation of key terms for each of the documents and the stronger and weaker cross-referencing between the parent and progeny domains.

Our chosen domain is cancer care where experts are attempting to share their knowledge with professional workers, including therapists, nurses, and radiation workers, and where both experts and professionals are attempting to do the same with increasingly Internet-aware actual or potential cancer patients. Ours is a corpus-based study.

4.         Monitoring and documenting change and differences: A health infospider

Health-care is an all-pervasive domain where advances in medicine and the concomitant costs respectively encourage and discourage the use of new knowledge. In this domain, documentation is the ‘main means of communication between care providers’ (Ruch et al 1999), and effective health-care delivery systems have become increasingly dependent on accurate and detailed clinical information based on best practices (Chute, Cohn and Campbell 1998).

Knowledge of advances and best practice can be shared and refined by formal knowledge dissemination outlets, for example journal papers, workshops and seminars, and through learning-by-doing during encounters with patients. The Internet facilitates sharing of scientific results either through digital journals or through research notes posted on secure websites relating to drug trials, for example. The widespread use of the Internet has led to potential and actual patients, or their friends and relatives, going online for information after receiving news that the patient is or might be suffering from cancer.

Health-care knowledge has to be shared between many organisations and increasingly that knowledge has to be shared with an open-ended audience. In health-care or its sub-domain cancer care, as in any other specialist domain, terminology management is of the essence: including new terms and expunging old ones. Maintainers of controlled medical vocabularies recognize that such vocabularies are not static (Cimino 1996).

The US National Cancer Institute (NCI) is attempting to provide up-to-date online information on cancer to two groups: health-care professionals and patients. The NCI website provides a facility for searching the contents of its document base; there is also a glossary of cancer terms. The website is organised and is accessible according to different facets: users can look at individual types of cancer, at different types of treatments, and at the results of studies being carried out. Information for professionals is generally in the form of an extended abstract or summary about a specific topic together with an extensive bibliography. References to published journal articles in the bibliography of a given extended abstract are generally hyperlinked to the abstract of the cited article. Information for patients is provided without extensive references to journal articles and is mainly in the form of fact sheets: highlights of a recent diagnostic or therapeutic discovery, of a long-term study and other useful information. In addition to the US NCI, and other national cancer charities like Cancer Research UK, pharmaceutical companies also provide information about their drugs as fact sheets.

4.1        Building a cancer infospider

In order to ascertain the subject-related signature of the language used by experts, of that used for cancer-care professionals, and of that used for addressing laypersons, especially patients, we have created three text corpora. We are not considering the parent discipline, cancer research, but are rather focusing on its three progenies to determine the extent to which knowledge is shared between them by measuring terminological commonalities. In order to illustrate our ideas we have focused on aspects of diagnosis (specifically the breast cancer gene), therapy and after-care of breast cancer patients.

The breast-cancer expert corpus comprised 300 texts, abstracts, and full papers (114,394 words). The texts were collected by navigat