3 Replies Latest reply on Jun 7, 2018 4:08 PM by 1006501

    performance of category definition

    936528 New User

      Hi,

      I've installed TB 4.2 on Win10. I have checked all Databases and Inventories (to be sure not to miss any data), but after clicking Define under Category Definition, it can take hours to find other chemicals (e.g., Schiff Base Formers in Protein Binding OECD).

       

      Is there an overview of which Inventory contains which data? For example, which Inventories could be unchecked for Skin Sensitization?

       

      Unchecking Databases that are obviously not related to skin sensitization doesn't speed up the search.

       

      Thanks,

      Stefan

        • Re: performance of category definition
          1006501 New User

          Dear Dr. Onken,

           

          The Databases are collections of structures that have experimental data. The Inventories, on the other hand, are collections of structures for which no experimental data are available.

          So when you search for analogues with observed data, you only need to check the Databases. Furthermore, the Inventories are not profiled in advance, so profiling them takes some time the first time it is done. Once the profilers have been applied to the Inventories, the profiling results are cached.
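
          The effect of this caching can be pictured with a small toy model. This is only a sketch: `CachingProfiler`, `toy_profiler`, and the three-structure "demo inventory" are hypothetical names made up for illustration and say nothing about how the Toolbox is implemented internally. It shows only that the expensive step runs once per structure, so a second pass over the same inventory needs no new computation.

```python
# Toy model of cached profiling. All names are hypothetical and are
# not related to the actual QSAR Toolbox internals.

class CachingProfiler:
    def __init__(self, profile_fn):
        self._profile_fn = profile_fn  # the (potentially slow) profiling step
        self._cache = {}               # (inventory, structure) -> result
        self.computed = 0              # number of times profile_fn actually ran

    def profile(self, inventory, structure):
        key = (inventory, structure)
        if key not in self._cache:
            # First encounter: run the profiler and remember the result.
            self._cache[key] = self._profile_fn(structure)
            self.computed += 1
        return self._cache[key]


def toy_profiler(structure):
    # Stand-in for a real profiler: a plain substring check on a SMILES string.
    return "C=O" in structure


profiler = CachingProfiler(toy_profiler)
inventory = ["CC=O", "CCO", "O=CC=O"]

# First pass: every structure has to be profiled.
first_run = [profiler.profile("demo inventory", s) for s in inventory]
computed_after_first = profiler.computed

# Second pass: all results come from the cache, nothing is recomputed.
second_run = [profiler.profile("demo inventory", s) for s in inventory]
computed_after_second = profiler.computed
```

          In this sketch the counter stays the same after the second pass, which corresponds to a slow first run followed by much faster runs afterwards.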

           

          If you are interested in all types of skin sensitization data, you can click on the corresponding row of the data matrix (Figure 1) and go to the Data module. All databases that contain skin sensitization data will be highlighted in green. You can then select all of the highlighted databases or just some of them.

          Figure 1

           

          If you are interested in a specific type of skin sensitization data (e.g. EC3), you can define your target endpoint (Figure 2).

          Figure 2

           

          The more completely the target endpoint is defined, the fewer databases will be highlighted. With the endpoint defined as Skin sensitization EC3 (Figure 2), only two databases are highlighted in green (Figure 3). This means that the remaining databases do not contain data for your endpoint of interest.

           

          Figure 3

           

           

          Kind regards,

          Darina

            • Re: performance of category definition
              936528 New User

              Dear Darina,

              Thank you very much.

              It was the first profiling run with the new version, which is why it took a while.

              Now profiling runs a lot faster (seconds, even if all Databases and Inventories are checked).

               

              But another question regarding the Inventories: if I want to profile for structural similarity, should I check all Inventories?

               

              Regards,

              Stefan

                • Re: performance of category definition
                  1006501 New User

                  Dear Stefan,

                   

                  It depends on what you want.

                  If your goal is to find analogues with data, you should check the Databases only.

                  On the other hand, if you just want to find the closest analogues regardless of data availability, you can also check the Inventories.

                  The Inventories in TB v.4.2 contain more than 300 000 chemicals, so please keep in mind that the more Inventories are checked, the more time profiling will need.

                   

                  Kind regards,

                  Darina